IMPROVED ADENOVIRUS VECTORS
FIELD OF THE INVENTION 

The invention relates to improved adenovirus vectors, and more specifically, 
adenovirus vectors useful for gene therapy. 

BACKGROUND 

Adenoviruses (Ad) are double-stranded DNA viruses. The genome of adenoviruses
(~36 kb) is complex and contains over 50 open reading frames (ORFs). These ORFs are
overlapping and genes encoding one protein are often embedded within genes coding for other
Ad proteins. Expression of Ad genes is divided into an early and a late phase. Early genes
are those transcribed prior to replication of the genome while late genes are transcribed after
replication. The early genes comprise E1a, E1b, E2a, E2b, E3 and E4. The E1a gene
products are involved in transcriptional regulation; the E1b gene products are involved in the
shut-off of host cell functions and mRNA transport. E2a encodes a DNA-binding protein
(DBP); E2b encodes the viral DNA polymerase and preterminal protein (pTP). The E3 gene
products are not essential for viral growth in cell culture. The E4 region encodes regulatory
proteins involved in transcriptional and post-transcriptional regulation of viral gene expression;
a subset of the E4 proteins are essential for viral growth. The products of the late genes (e.g.,
L1-5) are predominantly components of the virion as well as proteins involved in the
assembly of virions. The VA genes produce VA RNAs which block the host cell from
shutting down viral protein synthesis.

Adenoviruses or Ad vectors have been exploited for the delivery of foreign genes to
cells for a number of reasons including the fact that Ad vectors have been shown to be highly
effective for the transfer of genes into a wide variety of tissues in vivo and the fact that Ad
infects both dividing and non-dividing cells; a number of tissues which are targets for gene
therapy comprise largely non-dividing cells.

The current generation of Ad vectors suffers from a number of limitations which
preclude their widespread clinical use including: 1) immune detection and elimination of cells
infected with Ad vectors, 2) a limited carrying capacity (about 8.5 kb) for the insertion of
foreign genes and regulatory elements, and 3) low-level expression of Ad genes in cells
infected with recombinant Ad vectors (generally, the expression of Ad proteins is toxic to
cells).

The latter problem was thought to be solved by using vectors containing deletions in
the E1 region of the Ad genome (E1 gene products are required for viral gene expression and
replication). However, even with such vectors, low-level expression of Ad genes is observed.
It is now thought that most mammalian cells contain E1-like factors which can substitute for
the missing Ad E1 proteins and permit expression of Ad genes remaining on the E1-deleted
vectors.

What is needed is an approach that overcomes the problem of low level expression of 
Ad genes. Such an approach needs to ensure that adenovirus vectors are safe and
non-immunogenic.

SUMMARY OF THE INVENTION 

The present invention contemplates two approaches to improving adenovirus vectors. 
The first approach generally contemplates a recombinant plasmid, together with a helper 
adenovirus, in a packaging cell line. The helper adenovirus is rendered safe by utilization of 
loxP sequences. In the second approach, "damaged" adenoviruses are employed. While the 
"damaged" adenovirus is capable of self-propagation in a packaging cell line, it is not capable 
of expressing certain genes (e.g., the DNA polymerase gene and/or the adenovirus preterminal
protein gene).

In one embodiment of the first approach, the present invention contemplates a
recombinant plasmid, comprising in operable combination: a) a plasmid backbone,
comprising an origin of replication, an antibiotic resistance gene and a eukaryotic promoter
element; b) the left and right inverted terminal repeats (ITRs) of adenovirus, said ITRs each
having a 5' and a 3' end and arranged in a tail to tail orientation on said plasmid backbone;
c) the adenovirus packaging sequence, said packaging sequence having a 5' and a 3' end and
linked to one of said ITRs; and d) a first gene of interest operably linked to said promoter
element.

While it is not intended that the present invention be limited by the precise size of the
plasmid, it is generally desirable that the recombinant plasmid have a total size of between 27
and 40 kilobase pairs. It is preferred that the total size of the DNA packaged into an EAM
derived from these recombinant plasmids is about the length of the wild-type adenovirus
genome (~36 kb). It is well known in the art that DNA representing about 105% of the
wild-type length may be packaged into a viral particle; thus the EAM derived from a recombinant
plasmid may contain DNA whose length is up to ~105% of the size of the wild-type genome.
The size of the recombinant plasmid may be adjusted using reporter genes and genes of
interest having various sizes (including the use of different sizes of introns within these genes)
as well as through the use of irrelevant or non-coding DNA fragments which act as "stuffer"
fragments (e.g., portions of bacteriophage genomes).
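
The size bookkeeping described above can be illustrated with a minimal Python sketch. The 36 kb wild-type length and the 105% packaging figure are taken from the text; the function names and the element sizes in the usage note are hypothetical values chosen only for illustration and are not measurements from the invention.

```python
# Illustrative sketch only: sizes are in kilobase pairs (kb).
WILD_TYPE_KB = 36.0          # approximate wild-type Ad genome length (from the text)
PACKAGING_TOLERANCE = 1.05   # about 105% of wild-type length can be packaged (from the text)


def max_packaged_kb(wild_type_kb=WILD_TYPE_KB, tolerance=PACKAGING_TOLERANCE):
    """Upper bound on the DNA length that may be packaged into a viral particle."""
    return wild_type_kb * tolerance


def stuffer_needed_kb(element_sizes_kb, target_kb=WILD_TYPE_KB):
    """Stuffer DNA (kb) needed to bring the packaged DNA up to target_kb.

    element_sizes_kb lists the sizes of the ITRs, packaging sequence, genes of
    interest, reporter genes and so on (hypothetical inputs).
    """
    total = sum(element_sizes_kb)
    if total > max_packaged_kb():
        raise ValueError("elements exceed the packaging limit; no stuffer can help")
    # If the elements already reach the target, no stuffer is required.
    return max(0.0, target_kb - total)
```

For example, with hypothetical element sizes of 0.5 kb (ITRs plus packaging sequence), 14.0 kb (a cDNA expression cassette) and 4.0 kb (a reporter cassette), `stuffer_needed_kb([0.5, 14.0, 4.0])` indicates that 17.5 kb of stuffer DNA would bring the packaged DNA to about the wild-type length.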

In one embodiment of the recombinant plasmid, said 5' end of said packaging
sequence is linked to said 3' end of said left ITR. In this embodiment, said first gene of
interest is linked to said 3' end of said packaging sequence. It is not intended that the present
invention be limited by the nature of the gene of interest; a variety of genes (including both
cDNA and genomic forms) are contemplated; any gene having therapeutic value may be
inserted into the recombinant plasmids of the present invention. For example, the transfer of
the adenosine deaminase (ADA) gene is useful for the treatment of ADA-deficient patients; the
transfer of the CFTR gene is useful for the treatment of cystic fibrosis. A wide variety of
diseases are known to be due to a defect in a single gene. The plasmids, vectors and EAMs
of the present invention are useful for the transfer of a non-mutated form of a gene which is
mutated in a patient thereby resulting in disease. The present invention is illustrated using
recombinant plasmids capable of generating encapsidated adenovirus minichromosomes
(EAMs) containing the dystrophin cDNA gene (the cDNA form of this gene is preferred due
to the large size of this gene); the dystrophin gene is non-functional in muscular dystrophy
(MD) patients. However, the present invention is not limited toward the use of the dystrophin
gene for treatment of MD; the use of the utrophin (also called the dystrophin related protein)
gene is also contemplated for gene therapy for the treatment of MD [Tinsley et al. (1993)
Curr. Opin. Genet. Dev. 3:484 and (1992) Nature 360:591]; the utrophin gene product has
been reported to be capable of functionally substituting for the dystrophin gene product [Tinsley and
Davies (1993) Neuromusc. Disord. 3:539]. As the utrophin gene product is expressed in the
muscle of muscular dystrophy patients, no immune response would be directed against the
utrophin gene product expressed in cells of a host (including a human) containing the
recombinant plasmids, Ad vectors or EAMs of the present invention. While the present
invention is illustrated using plasmids containing the dystrophin gene, the plasmids, Ad
vectors and EAMs of the present invention have broad application for the transfer of any gene
whose gene product is missing or altered in activity in cells.

Embodiments are contemplated wherein the recombinant plasmid further comprises a
second gene of interest. In one embodiment, said second gene of interest is linked to said 3'
end of said right ITR. In one embodiment, said second gene of interest is a reporter gene. A
variety of reporter genes are contemplated, including but not limited to the E. coli β-galactosidase
gene, the human placental alkaline phosphatase gene, the green fluorescent protein gene and
the chloramphenicol acetyltransferase gene.

As mentioned above, the first approach also involves the use of a helper adenovirus in
combination with the recombinant plasmid. In one embodiment, the present invention
contemplates a helper adenovirus comprising i) first and second loxP sequences, and ii) the
adenovirus packaging sequence, said packaging sequence having a 5' and a 3' end. It is
preferred that said first loxP sequence is linked to the 5' end of said packaging sequence and
said second loxP sequence is linked to said 3' end of said packaging sequence. In one
embodiment, the helper virus comprises at least one adenovirus gene coding region.
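
The arrangement of a packaging sequence flanked by two loxP sequences can be illustrated with a toy Python model. The sketch assumes that a site-specific recombinase acting on the loxP sequences (Cre recombinase is named elsewhere in this document) excises the DNA lying between two directly repeated sites and leaves a single site behind. The 34 bp loxP sequence is the commonly cited canonical site; the function name and the flanking sequences in the usage note are hypothetical placeholders.

```python
# Commonly cited 34 bp loxP site: 13 bp repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def excise_floxed(genome: str, site: str = LOXP) -> str:
    """Return the genome after excision of the segment between two direct-repeat sites.

    Simplified model (an assumption, not the patent's protocol): the first two
    occurrences of `site` are treated as a directly repeated pair, the DNA
    between them is removed, and one copy of the site remains.
    """
    first = genome.find(site)
    if first == -1:
        return genome  # no site present, nothing to excise
    second = genome.find(site, first + len(site))
    if second == -1:
        return genome  # a single site cannot recombine with itself in this model
    # Keep everything through the first site, then resume after the second site.
    return genome[: first + len(site)] + genome[second + len(site):]
```

With a toy helper genome such as `"AAAA" + LOXP + "ccccgggg" + LOXP + "TTTT"`, where the lowercase segment stands in for the packaging sequence, the function returns `"AAAA" + LOXP + "TTTT"`; in this model the helper genome has lost its packaging sequence.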

The present invention contemplates a mammalian cell line containing the
above-described recombinant plasmid and the above-described helper virus. It is preferred that said
cell line is a 293-derived cell line. Specifically, in one embodiment, the present invention
contemplates a mammalian cell line, comprising: a) a recombinant plasmid, comprising, in
operable combination: i) a plasmid backbone, comprising an origin of replication, an
antibiotic resistance gene and a eukaryotic promoter element, ii) the left and right inverted
terminal repeats (ITRs) of adenovirus, said ITRs each having a 5' and a 3' end and arranged
in a tail to tail orientation on said plasmid backbone, iii) the adenovirus packaging sequence,
said packaging sequence having a 5' and a 3' end and linked to one of said ITRs, and iv) a
first gene of interest operably linked to said promoter element; and b) a helper adenovirus
comprising i) first and second loxP sequences, and ii) the adenovirus packaging sequence,
said packaging sequence having a 5' and a 3' end. As noted previously, said helper can
further comprise at least one adenovirus gene coding region.

Overall, the first approach allows for a method of producing an adenovirus
minichromosome. In one embodiment, this method comprises: A) providing a mammalian
cell line containing: a) a recombinant plasmid, comprising, in operable combination, i) a
plasmid backbone, comprising an origin of replication, an antibiotic resistance gene and a
eukaryotic promoter element, ii) the left and right inverted terminal repeats (ITRs) of
adenovirus, said ITRs each having a 5' and a 3' end and arranged in a tail to tail orientation
on said plasmid backbone, iii) the adenovirus packaging sequence, said packaging sequence
having a 5' and a 3' end and linked to one of the ITRs, and iv) a first gene of interest
operably linked to said promoter element; and b) a helper adenovirus comprising i) first and
second loxP sequences, ii) at least one adenovirus gene coding region, and iii) the adenovirus
packaging sequence, said packaging sequence having a 5' and a 3' end; and
B) growing said cell line under conditions such that said adenovirus gene coding region is
expressed and said recombinant plasmid directs the production of at least one adenoviral
minichromosome. It is desired that said adenovirus minichromosome is encapsidated.

In one embodiment, the present invention contemplates recovering said encapsidated
adenovirus minichromosome and, in turn, purifying said recovered encapsidated adenovirus
minichromosome. Thereafter, said purified encapsidated adenovirus minichromosome can be
administered to a host (e.g., a mammal). Human therapy is thereby contemplated.

It is not intended that the present invention be limited by the nature of the
administration of said minichromosomes. All types of administration are contemplated,
including direct injection (intramuscular, intravenous, subcutaneous, etc.), inhalation, etc.

As noted above, the present invention contemplates a second approach to improving
adenovirus vectors. In the second approach, "damaged" adenoviruses are employed. In one
embodiment, the present invention contemplates a recombinant adenovirus comprising the
adenovirus E2b region having a deletion, said adenovirus capable of self-propagation in a
packaging cell line and said E2b region comprising the DNA polymerase gene and the
adenovirus preterminal protein gene. In this embodiment, said deletion can be within the
adenovirus DNA polymerase gene. Alternatively, said deletion is within the adenovirus
preterminal protein gene. Finally, the present invention also contemplates embodiments
wherein said deletion is within the adenovirus DNA polymerase and preterminal protein
genes.

The present invention further provides cell lines capable of supporting the propagation
of Ad virus containing deletions within the E2b region. In one embodiment the invention
provides a mammalian cell line stably and constitutively expressing the adenovirus E1 gene
products and the adenovirus DNA polymerase. In one embodiment, these cell lines comprise
a recombinant adenovirus comprising a deletion within the E2b region, this E2b-deleted
recombinant adenovirus being capable of self-propagation in the cell line. The present
invention is not limited by the nature of the deletion within the E2b region. In one
embodiment, the deletion is within the adenoviral DNA polymerase gene.

The present invention provides cell lines stably expressing E1 proteins and the
adenoviral DNA polymerase, wherein the genome of the cell line contains a nucleotide
sequence encoding adenovirus DNA polymerase operably linked to a heterologous promoter.
In a particularly preferred embodiment, the cell line is selected from the group consisting of
the B-6, B-9, C-1, C-4, C-7, C-13 and C-14 cell lines.

The present invention further provides cell lines which further constitutively express
the adenovirus preterminal protein (pTP) gene product (in addition to E1 proteins and DNA
polymerase). In one embodiment, these pTP-expressing cell lines comprise a recombinant
adenovirus comprising a deletion within the E2b region, the recombinant adenovirus being
capable of self-propagation in the pTP-expressing cell line. In a preferred embodiment, the
deletion within the E2b region comprises a deletion within the adenoviral preterminal protein
gene. In another preferred embodiment, the deletion within the E2b region comprises a
deletion within the adenoviral (Ad) DNA polymerase and preterminal protein genes.

In a preferred embodiment, the cell lines coexpressing pTP and Ad DNA polymerase
contain, within their genome, a nucleotide sequence encoding adenovirus preterminal protein
operably linked to a heterologous promoter. The invention is not limited by the nature of
the heterologous promoter chosen. The art knows well how to select a suitable heterologous
promoter to achieve expression in the desired host cell (e.g., 293 cells or derivatives thereof).

In a particularly preferred embodiment, the pTP- and Ad polymerase-expressing cell line is
selected from the group consisting of the C-1, C-4, C-7, C-13 and C-14 cell lines.

The present invention provides a method of producing infectious recombinant
adenovirus particles containing an adenoviral genome containing a deletion within the E2b
region, comprising: a) providing: i) a mammalian cell line stably and constitutively
expressing the adenovirus E1 gene products and the adenovirus DNA polymerase; ii) a
recombinant adenovirus comprising a deletion within the E2b region, the recombinant
adenovirus being capable of self-propagation in said cell line; b) introducing the recombinant
adenovirus into the cell line under conditions such that the recombinant adenovirus is
propagated to form infectious recombinant adenovirus particles; and c) recovering the
infectious recombinant adenovirus particles. In a preferred embodiment, the method further
comprises d) purifying the recovered infectious recombinant adenovirus particles. In yet
another preferred embodiment, the method further comprises e) administering the purified
recombinant adenovirus particles to a host which is preferably a mammal and most preferably
a human.

In another preferred embodiment the mammalian cell line employed in the above
method further constitutively expresses the adenovirus preterminal protein.

The present invention further provides a recombinant plasmid capable of replicating in
a bacterial host comprising adenoviral E2b sequences, the E2b sequences containing a deletion
within the polymerase gene, the deletion resulting in reduced polymerase activity. The
present invention is not limited by the specific deletion employed to reduce polymerase
activity. In a preferred embodiment, the deletion comprises a deletion of nucleotides 8772 to
9385 in SEQ ID NO:4. In one preferred embodiment, the recombinant plasmid has the
designation pΔpol. In another preferred embodiment, the recombinant plasmid has the
designation pBHG11Δpol.

The present invention also provides a recombinant plasmid capable of replicating in a
bacterial host comprising adenoviral E2b sequences, the E2b sequences containing a deletion
within the preterminal protein gene, the deletion resulting in the inability to express functional
preterminal protein without disruption of the VA RNA genes. The present invention is not
limited by the specific deletion employed to render the pTP inactive; any deletion within the
pTP coding region which does not disrupt the ability to express the Ad VA RNA genes may
be employed. In a preferred embodiment, the deletion comprises a deletion of nucleotides
1 0.7Cb to 11134 in SEQ ID NO:4. In one preferred embodiment, the recombinant plasmid
has the designation pΔpTP. In another preferred embodiment, the recombinant plasmid has
the designation pBHG11ΔpTP.

In a preferred embodiment, the recombinant plasmid containing a deletion within the
pTP region further comprises a deletion within the polymerase gene, this deletion resulting in
reduced (preferably absent) polymerase activity. The present invention is not limited by the
specific deletion employed to inactivate the polymerase and pTP genes. In a preferred
embodiment, the deletion comprises a deletion of nucleotides 8773 to 9586 and 11067 to
12513 in SEQ ID NO:4. In one preferred embodiment, the recombinant plasmid has the
designation pAXBΔpolΔpTPVARNA+t13. In another preferred embodiment, the recombinant
plasmid has the designation pBHG11ΔpolΔpTPVARNA+t13.

BRIEF DESCRIPTION OF THE FIGURES 

Figure 1A is a schematic representation of the Ad polymerase expression plasmid
pRSV-pol indicating that the Ad2 DNA polymerase sequences are under the transcriptional
control of the RSV-LTR/promoter element and are flanked on the 3' end by the SV-40 small
t intron and SV-40 polyadenylation addition site.

Figure 1B is a schematic representation of the expression plasmid pRSV-pTP
indicating that the Ad2 preterminal protein sequences are under the transcriptional control of
the RSV-LTR/promoter element and are flanked on the 3' end by the SV-40 small-t intron
and SV-40 polyadenylation signals.

Figure 2 is an ethidium bromide-stained gel depicting the presence of Ad pol DNA
sequences in genomic DNA from LP-293 cells and several hygromycin-resistant cell lines.
The ~750 bp PCR products are indicated by the arrow.

Figure 3 is an autoradiograph depicting the results of a viral
replication-complementation assay analyzing the functional activity of the Ad polymerase protein
expressed by LP-293 cells and several hygromycin-resistant cell lines. The 8,010 bp HindIII
fragments analyzed by densitometry are indicated by an arrow.

Figure 4 is an autoradiograph indicating that cell lines B-6 and C-7 contained a
smaller and a larger species of Ad polymerase mRNA while LP-293 derived RNA had no
detectable hybridization signal. The locations of the two species of Ad polymerase mRNA are
indicated relative to the 28S and 18S ribosomal RNAs, and the aberrant transcript expressed
by the B-9 cell line is indicated by an arrow.

Figure 5 is an autoradiograph showing which of the cell lines that received the
preterminal protein expression plasmid indicated the presence of pRSV-pTP sequences (arrow
labelled "pTP") and E1 sequences (arrow labelled "E1").

Figure 6 is an autoradiograph indicating in which of the cell lines transcription of
preterminal protein is occurring (arrow labelled "3kb").

Figure 7 is an autoradiograph indicating that the expression of the Ad polymerase
could overcome the replication defect of H5sub100 at non-permissive temperatures.

Figure 8 graphically depicts plaque titre for LP-293, B-6 and C-7 cell lines infected
with wtAd5, H5ts36, or H5sub100, and the results demonstrate that the C-7 cell line can be
used as a packaging cell line to allow the high level growth of E1, preterminal, and
polymerase deleted Ad vectors.

Figure 9 is an autoradiograph showing that the recombinant pol⁻ virus is viable on
pol-expressing 293 cells but not on 293 cells, which demonstrates that recombinant Ad viruses
containing the 612 bp deletion found within pΔpol lack the ability to express Ad polymerase.

Figure 10 is a schematic representation of the structure of pAd5βdys wherein the two
inverted adenovirus origins of replication are represented by a left and right inverted terminal
repeat (LITR/RITR). P1 and P2 represent the location of probes used for Southern blot analysis.

Figure 11 graphically illustrates the total number of transducing adenovirus particles
produced (output) per serial passage on 293 cells, total input virus of either the helper (hpAP)
or Ad5βdys, and the total number of cells used in each infection.

Figures 12A-B show Southern blot analyses of viral DNA from lysates 3, 6, 9 and 12,
digested with the restriction enzymes BssHII, NruI and EcoRV. For the analyses, fragments
from the C terminus of Mus musculus dystrophin cDNA (A) or the N terminus of E. coli
β-galactosidase (B) were labeled with [32P]dCTP and used as probes.

Figures 13A-B show the physical separation of Ad5βdys from hpAP virions at the
third (A) and final (B) stages of CsCl purification.

Figure 14 graphically depicts the level of contamination of Ad5βdys EAMs by hpAP
virions obtained from the final stage of CsCl purification as measured by β-galactosidase and
alkaline phosphatase expression. The ratio of the two types of virions, Ad5βdys EAMs
(LacZ) or hpAP (AP), in each fraction is indicated in the lower graph.

Figures 15A-B are western blot (immunoblot) analyses of protein extracts from mdx
myoblasts and myotubes demonstrating the expression of β-galactosidase (A) and dystrophin
(B) in cells infected with Ad5βdys EAMs.

Figures 16A-C depict immunofluorescence of dystrophin expression in wild type
MM14 myotubes (A), uninfected mdx (B) and infected mdx myotubes (C).

Figure 17 is a schematic representation of the MCK/lacZ constructs tested to determine
what portion of the ~3.3 kb DNA fragment containing the enhancer/promoter of the MCK
gene is capable of directing high levels of expression of linked genes in muscle cells.

Figure 18 is a schematic representation of a GFP/β-gal reporter construct suitable for
assaying the expression of Cre recombinase in mammalian cells.

Figure 19 is a schematic representation of the recombination event between the loxP
shuttle vector and the Ad5dl7001 genome.

DEFINITIONS 

To facilitate understanding of the invention, a number of terms are defined below. 

The term "gene" refers to a DNA sequence that comprises control and coding 
sequences necessary for the production of a polypeptide or precursor thereof. The polypeptide 
can be encoded by a full length coding sequence or by any portion of the coding sequence so 
long as the desired enzymatic activity is retained. The term "gene" encompasses both cDNA 
and genomic forms of a given gene. 

The term "wild-type" refers to a gene or gene product which has the characteristics of
that gene or gene product when isolated from a naturally occurring source. A wild-type gene
is that which is most frequently observed in a population and is thus arbitrarily designated the
"normal" or "wild-type" form of the gene. In contrast, the term "modified" or "mutant" refers
to a gene or gene product which displays modifications in sequence and/or functional
properties (i.e., altered characteristics) when compared to the wild-type gene or gene product.
It is noted that naturally-occurring mutants can be isolated; these are identified by the fact that
they have altered characteristics when compared to the wild-type gene or gene product.

The term "oligonucleotide" as used herein is defined as a molecule comprised of two
or more deoxyribonucleotides or ribonucleotides, usually more than three (3), and typically
more than ten (10) and up to one hundred (100) or more (although preferably between twenty
and thirty). The exact size will depend on many factors, which in turn depend on the
ultimate function or use of the oligonucleotide. The oligonucleotide may be generated in any
manner, including chemical synthesis, DNA replication, reverse transcription, or a
combination thereof.

As used herein, the term "regulatory element" refers to a genetic element which
controls some aspect of the expression of nucleic acid sequences. For example, a promoter is
a regulatory element which facilitates the initiation of transcription of an operably linked
coding region. Other regulatory elements are splicing signals, polyadenylation signals,
termination signals, etc. (defined infra).

Transcriptional control signals in eucaryotes comprise "promoter" and "enhancer"
elements. Promoters and enhancers consist of short arrays of DNA sequences that interact
specifically with cellular proteins involved in transcription [Maniatis, T. et al., Science
236:1237 (1987)]. Promoter and enhancer elements have been isolated from a variety of
eukaryotic sources including genes in yeast, insect and mammalian cells and viruses
(analogous control elements, i.e., promoters, are also found in procaryotes). The selection of
a particular promoter and enhancer depends on what cell type is to be used to express the
protein of interest. Some eukaryotic promoters and enhancers have a broad host range while
others are functional in a limited subset of cell types [for review see Voss, S.D. et al., Trends
Biochem. Sci., 11:287 (1986) and Maniatis, T. et al., supra (1987)].

The term "recombinant DNA vector" as used herein refers to DNA sequences
containing a desired coding sequence and appropriate DNA sequences necessary for the
expression of the operably linked coding sequence in a particular host organism (e.g.,
mammal). DNA sequences necessary for expression in procaryotes include a promoter,
optionally an operator sequence, a ribosome binding site and possibly other sequences.
Eukaryotic cells are known to utilize promoters, polyadenylation signals and enhancers.

The terms "in operable combination", "in operable order" and "operably linked" as
used herein refer to the linkage of nucleic acid sequences in such a manner that a nucleic acid
molecule capable of directing the transcription of a given gene and/or the synthesis of a
desired protein molecule is produced. The term also refers to the linkage of amino acid
sequences in such a manner so that a functional protein is produced.

The term "genetic cassette" as used herein refers to a fragment or segment of DNA
containing a particular grouping of genetic elements. The cassette can be removed and
inserted into a vector or plasmid as a single unit. A "plasmid backbone" refers to a piece of
DNA containing at least a plasmid origin of replication and a selectable marker gene (e.g., an
antibiotic resistance gene) which allows for selection of bacterial hosts containing the plasmid;
the plasmid backbone may also include a polylinker region to facilitate the insertion of
genetic elements within the plasmid. When a particular plasmid is modified to contain
non-plasmid elements (e.g., insertion of Ad sequences and/or a eukaryotic gene of interest linked
to a promoter element), the plasmid sequences are referred to as the plasmid backbone.

Because mononucleotides are reacted to make oligonucleotides in a manner such that
the 5' phosphate of one mononucleotide pentose ring is attached to the 3' oxygen of its
neighbor in one direction via a phosphodiester linkage, an end of an oligonucleotide is
referred to as the "5' end" if its 5' phosphate is not linked to the 3' oxygen of a
mononucleotide pentose ring and as the "3' end" if its 3' oxygen is not linked to a 5'
phosphate of a subsequent mononucleotide pentose ring. As used herein, a nucleic acid
sequence, even if internal to a larger oligonucleotide, also may be said to have 5' and 3' ends.
When two different, non-overlapping oligonucleotides anneal to different regions of
the same linear complementary nucleic acid sequence, and the 3' end of one oligonucleotide
points towards the 5' end of the other, the former may be called the "upstream"
oligonucleotide and the latter the "downstream" oligonucleotide.

The term "primer" refers to an oligonucleotide which is capable of acting as a point of
initiation of synthesis when placed under conditions in which primer extension is initiated.
An oligonucleotide "primer" may occur naturally, as in a purified restriction digest, or may be
produced synthetically.

A primer is selected to be "substantially" complementary to a strand of specific
sequence of the template. A primer must be sufficiently complementary to hybridize with a
template strand for primer elongation to occur. A primer sequence need not reflect the exact
sequence of the template. For example, a non-complementary nucleotide fragment may be
attached to the 5' end of the primer, with the remainder of the primer sequence being
substantially complementary to the strand. Non-complementary bases or longer sequences can
be interspersed into the primer, provided that the primer sequence has sufficient
complementarity with the sequence of the template to hybridize and thereby form a
template-primer complex for synthesis of the extension product of the primer.

"Hybridization" methods involve the annealing of a complementary sequence to the
target nucleic acid (the sequence to be detected). The ability of two polymers of nucleic acid
containing complementary sequences to find each other and anneal through base pairing
interaction is a well-recognized phenomenon. The initial observations of the "hybridization"
process by Marmur and Lane, Proc. Natl. Acad. Sci. USA 46:453 (1960) and Doty et al.,
Proc. Natl. Acad. Sci. USA 46:461 (1960) have been followed by the refinement of this
process into an essential tool of modern biology.

The complement of a nucleic acid sequence as used herein refers to an oligonucleotide
which, when aligned with the nucleic acid sequence such that the 5' end of one sequence is
paired with the 3' end of the other, is in "antiparallel association." Certain bases not
commonly found in natural nucleic acids may be included in the nucleic acids of the present
invention and include, for example, inosine and 7-deazaguanine. Complementarity need not
be perfect; stable duplexes may contain mismatched base pairs or unmatched bases. Those
skilled in the art of nucleic acid technology can determine duplex stability empirically
considering a number of variables including, for example, the length of the oligonucleotide,
base composition and sequence of the oligonucleotide, ionic strength and incidence of
mismatched base pairs.

Stability of a nucleic acid duplex is measured by the melting temperature, or "Tm."
The Tm of a particular nucleic acid duplex under specified conditions is the temperature at
which on average half of the base pairs have dissociated. The equation for calculating the
Tm of nucleic acids is well known in the art.
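
By way of illustration only, the following sketch estimates the Tm of a short DNA sequence. The text above does not specify which equation is meant; the two formulas used here are common textbook approximations and are assumptions: the Wallace rule (2 degrees per A/T, 4 degrees per G/C, for short oligonucleotides) and the estimate Tm = 81.5 + 0.41(% G+C) for a nucleic acid in aqueous solution at 1 M NaCl. The function names are hypothetical.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    s = seq.upper()
    if not s:
        raise ValueError("sequence must not be empty")
    return 100.0 * (s.count("G") + s.count("C")) / len(s)


def tm_wallace(seq: str) -> float:
    """Wallace rule estimate (assumed formula), intended for short oligonucleotides."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2.0 * at + 4.0 * gc


def tm_gc_content(seq: str) -> float:
    """GC-content estimate (assumed formula), for aqueous solution at 1 M NaCl."""
    return 81.5 + 0.41 * gc_percent(seq)


if __name__ == "__main__":
    primer = "ATGCATGCATGCATGCAT"  # hypothetical 18-mer, not taken from the text
    print("Wallace rule estimate (deg C):", tm_wallace(primer))
    print("GC-content estimate (deg C):", round(tm_gc_content(primer), 1))
```

Both values are rough estimates; as stated above, duplex stability also depends on ionic strength, sequence and mismatches, and is best determined empirically.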

The term "probe" as used herein refers to a labeled oligonucleotide which forms a
duplex structure with a sequence in another nucleic acid, due to complementarity of at least
one sequence in the probe with a sequence in the other nucleic acid.

The term "label" as used herein refers to any atom or molecule which can be used to
provide a detectable (preferably quantifiable) signal, and which can be attached to a nucleic
acid or protein. Labels may provide signals detectable by fluorescence, radioactivity,
colorimetry, gravimetry, X-ray diffraction or absorption, magnetism, enzymatic activity, and
the like.

The terms "nucleic acid substrate" and "nucleic acid template" are used herein
interchangeably and refer to a nucleic acid molecule which may comprise single- or double-
stranded DNA or RNA.

"Oligonucleotide primers matching or complementary to a gene sequence" refers to
oligonucleotide primers capable of facilitating the template-dependent synthesis of single- or
double-stranded nucleic acids. Oligonucleotide primers matching or complementary to a gene
sequence may be used in PCRs, RT-PCRs and the like.

"A consensus gene sequence" refers to a gene sequence which is derived by
comparison of two or more gene sequences and which describes the nucleotides most often
present in a given segment of the genes; the consensus sequence is the canonical sequence.

The term "polymorphic locus" is a locus present in a population which shows variation
between members of the population (i.e., the most common allele has a frequency of less than
0.95). In contrast, a "monomorphic locus" is a genetic locus at which little or no variation is seen
between members of the population (generally taken to be a locus at which the most common
allele exceeds a frequency of 0.95 in the gene pool of the population).
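
The distinction drawn above can be sketched as a simple classification on allele frequencies. This is a minimal illustration, not part of the disclosure: the function name and the example frequencies are hypothetical, and a most common allele at exactly 0.95 (a boundary the definitions leave open) is treated here as monomorphic by assumption.

```python
from typing import Mapping


def classify_locus(allele_freqs: Mapping[str, float], threshold: float = 0.95) -> str:
    """Return 'polymorphic' if the most common allele is below the threshold."""
    if not allele_freqs:
        raise ValueError("at least one allele frequency is required")
    if abs(sum(allele_freqs.values()) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    most_common = max(allele_freqs.values())
    # Assumption: a frequency of exactly 0.95 is classified as monomorphic.
    return "polymorphic" if most_common < threshold else "monomorphic"


if __name__ == "__main__":
    print(classify_locus({"A1": 0.70, "A2": 0.30}))  # hypothetical frequencies
    print(classify_locus({"A1": 0.99, "A2": 0.01}))  # hypothetical frequencies
```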

The term "microorganism" as used herein means an organism too small to be observed 
with the unaided eye and includes, but is not limited to bacteria, viruses, protozoans, fungi, 
and ciliates. 

The term "microbial gene sequences" refers to gene sequences derived from a 
microorganism. 

The term "bacteria" refers to any bacterial species including eubacterial and 
archaebacterial species. 

The term "virus" refers to obligate, ultramicroscopic, intracellular parasites incapable
of autonomous replication (i.e., replication requires the use of the host cell's machinery).
Adenoviruses, as noted above, are double-stranded DNA viruses. The left and right inverted
terminal repeats (ITRs) are short elements located at the 5' and 3' termini of the linear Ad
genome, respectively, and are required for replication of the viral DNA. The left ITR is
located between 1-130 bp in the Ad genome (also referred to as 0-0.5 mu). The right ITR is
located from ~3,7500 bp to the end of the genome (also referred to as 99.5-100 mu). The two
ITRs are inverted repeats of each other. For clarity, the left ITR or 5' end is used to define the
5' and 3' ends of the ITRs. The 5' end of the left ITR is located at the extreme 5' end of the
linear adenoviral genome; picturing the left ITR (LITR) as an arrow extending from the 5'
end of the genome, the tail of the 5' ITR is located at mu 0 and the head of the left ITR is
located at ~0.5 mu (further, the tail of the left ITR is referred to as the 5' end of the left ITR
and the head of the left ITR is referred to as the 3' end of the left ITR). The tail of the right
or 3' ITR is located at mu 100 and the head of the right ITR is located at ~mu 99.5; the head
of the right ITR is referred to as the 5' end of the right ITR and the tail of the right ITR is
referred to as the 3' end of the right ITR (RITR). In the linear Ad genome, the ITRs face
each other with the head of each ITR pointing inward toward the bulk of the genome. When
arranged in a "tail to tail orientation" the tails of each ITR (which comprise the 5' end of the
LITR and the 3' end of the RITR) are located in proximity to one another while the heads of
each ITR are separated and face outward (see, for example, the arrangement of the ITRs in the
EAM shown in Figure 10 herein). The "adenovirus packaging sequence" refers to the Ψ
sequence which comprises five (AI-AV) packaging signals and is required for encapsidation
of the mature linear genome; the packaging signals are located from ~194 to 358 bp in the Ad
genome (about 0.5-1.0 mu).
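
The relationship between base pair positions and map units (mu) used above can be sketched as follows. One map unit corresponds to 1% of the genome length. The genome length of 36,000 bp is an assumption based on the approximate genome size (~36 kb) given in the Background, so the converted values are approximate; the function names are hypothetical.

```python
GENOME_LENGTH_BP = 36_000  # assumption: approximate Ad genome length (~36 kb)


def bp_to_mu(position_bp: float, genome_length_bp: int = GENOME_LENGTH_BP) -> float:
    """Convert a base pair position to map units (percent of genome length)."""
    return 100.0 * position_bp / genome_length_bp


def mu_to_bp(mu: float, genome_length_bp: int = GENOME_LENGTH_BP) -> int:
    """Convert map units to the nearest base pair position."""
    return round(mu * genome_length_bp / 100.0)


if __name__ == "__main__":
    # Packaging signals, stated above to lie from about 194 to 358 bp.
    print(round(bp_to_mu(194), 1), round(bp_to_mu(358), 1))
    # Approximate position of the head of the right ITR (about 99.5 mu).
    print(mu_to_bp(99.5))
```

Under this assumed genome length, positions 194 and 358 bp fall at roughly 0.5 and 1.0 mu, in agreement with the range stated for the packaging signals.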

The phrase "at least one adenovirus gene coding region" refers to a nucleotide
sequence containing more than one adenovirus gene coding region. A "helper adenovirus" or
"helper virus" refers to an adenovirus which is replication-competent in a particular host cell
(the host may provide Ad gene products such as E1 proteins); this replication-competent virus
is used to supply in trans functions (e.g., proteins) which are lacking in a second
replication-incompetent virus; the first replication-competent virus is said to "help" the second
replication-incompetent virus, thereby permitting the propagation of the second viral genome
in the cell containing the helper and second viruses.

The term "containing a deletion within the E2b region" refers to a deletion of at least
one basepair (preferably more than one bp and preferably at least 100 and most preferably
more than 300 bp) within the E2b region of the adenovirus genome. An E2b deletion is a
deletion that prevents expression of at least one E2b gene product and encompasses deletions
within exons encoding portions of E2b-specific proteins as well as deletions within promoter
and leader sequences.

An "adenovirus minichromosome" refers to a linear molecule of DNA containing the
Ad ITRs on each end which is generated from a plasmid containing the ITRs and one or more
genes of interest. The term "encapsidated adenovirus minichromosome" or "EAM" refers to an
adenovirus minichromosome which has been packaged or encapsidated into a viral particle;
plasmids containing the Ad ITRs and the packaging signal are shown herein to produce
EAMs. When used herein, "recovering" encapsidated adenovirus minichromosomes refers to
the collection of EAMs from a cell containing an EAM plasmid and a helper virus; this cell
will direct the encapsidation of the minichromosome to produce EAMs. The EAMs may be
recovered from these cells by lysis of the cell (e.g., freeze-thawing) and pelleting of the cell
debris to a cell extract as described in Example 1 (Ex. 1 describes the recovery of Ad virus
from a cell, but the same technique is used to recover EAMs from a cell). "Purifying" such
minichromosomes refers to the isolation of the recovered EAMs in a more concentrated form
(relative to the cell lysate) on a density gradient as described in Example 7; purification of
recovered EAMs permits the physical separation of the EAM from any helper virus (if
present).

The term "transfection" as used herein refers to the introduction of foreign DNA into
eukaryotic cells. Transfection may be accomplished by a variety of means known to the art
including calcium phosphate-DNA co-precipitation, DEAE-dextran-mediated transfection,
polybrene-mediated transfection, electroporation, microinjection, liposome fusion, lipofection,
protoplast fusion, retroviral infection, and biolistics.

The term "stable transfection" or "stably transfected" refers to the introduction and
integration of foreign DNA into the genome of the transfected cell. The term "stable
transfectant" refers to a cell which has stably integrated foreign DNA into the genomic DNA.

As used herein, the term "gene of interest" refers to a gene inserted into a vector or
plasmid whose expression is desired in a host cell. Genes of interest include genes having
therapeutic value as well as reporter genes. A variety of such genes are contemplated,
including genes of interest encoding a protein which provides a therapeutic function (such as
the dystrophin gene, which is capable of correcting the defect seen in the muscle of MD
patients), the utrophin gene, the CFTR gene (capable of correcting the defect seen in cystic
fibrosis patients), etc.

The term "reporter gene" indicates a gene sequence that encodes a reporter molecule
(including an enzyme). A "reporter molecule" is detectable in any detection system,
including, but not limited to, enzyme (e.g., ELISA, as well as enzyme-based histochemical
assays), fluorescent, radioactive, and luminescent systems. In one embodiment, the present
invention contemplates the E. coli β-galactosidase gene (available from Pharmacia Biotech,
Piscataway, NJ), green fluorescent protein (GFP) (commercially available from Clontech,
Palo Alto, CA), the human placental alkaline phosphatase gene, the chloramphenicol
acetyltransferase (CAT) gene; other reporter genes are known to the art and may be
employed.

WO 98/17783    PCT/US97/19541

As used herein, the terms "nucleic acid molecule encoding," "DNA sequence
encoding," and "DNA encoding" refer to the order or sequence of deoxyribonucleotides along
a strand of deoxyribonucleic acid. The order of these deoxyribonucleotides determines the
order of amino acids along the polypeptide (protein) chain. The DNA sequence thus codes
for the amino acid sequence.



DESCRIPTION OF THE INVENTION 

The present invention provides improved adenovirus vectors for the delivery of
recombinant genes to cells in vitro and in vivo. As noted above, the present invention
contemplates two approaches to improving adenovirus vectors. The first approach generally
contemplates a recombinant plasmid containing the minimal region of the Ad genome
required for replication and packaging (i.e., the left and right ITR and the packaging or Ψ
sequence) along with one or more genes of interest; this recombinant plasmid is packaged into
an encapsidated adenovirus minichromosome (EAM) when grown in parallel with an
E1-deleted helper virus in a cell line expressing the E1 proteins (e.g., 293 cells). The
recombinant adenoviral minichromosome is preferentially packaged. To prevent the
packaging of the helper virus, a helper virus containing loxP sequences flanking the Ψ
sequence is employed in conjunction with 293 cells expressing Cre recombinase; Cre-loxP
mediated recombination removes the packaging sequence from the helper genome, thereby
preventing packaging of the helper during the production of EAMs. In the second approach,
"damaged" or "deleted" adenoviruses containing deletions within the E2b region are
employed. While the "damaged" adenovirus is capable of self-propagation in a packaging cell
line expressing the appropriate E2b protein(s), the E2b-deleted recombinant adenoviruses are
incapable of replicating and expressing late viral gene products outside of the packaging cell
line.

In one embodiment, the self-propagating recombinant adenoviruses contain
deletions in the E2b region of the adenovirus genome. In another embodiment, "gutted"
viruses are contemplated; these viruses lack all viral coding regions. In addition, packaging
cell lines co-expressing E1 and E2b gene products are provided which allow the production of
infectious recombinant virus containing deletions in the E1 and E2b regions without the use
of helper virus.

The Description of the Invention is divided into the following sections: I. Self-
Propagating Adenovirus Vectors; II. Packaging Cell Lines; and III. Encapsidated Adenoviral
Minichromosomes.

I. Self-Propagating Adenovirus Vectors 

Self-propagating adenovirus (Ad) vectors have been extensively utilized to deliver
foreign genes to a great variety of cell types in vitro and in vivo. "Self-propagating viruses"
are those which can be produced by transfection of a single piece of DNA (the recombinant
viral genome) into a single packaging cell line to produce infectious virus; self-propagating
viruses do not require the use of helper virus for propagation.

Existing Ad vectors have been shown to be problematic in vivo. This is due in part
to the fact that current or first generation Ad vectors are deleted for only the early region 1 (E1)
genes. These vectors are crippled in their ability to replicate normally without the trans-
complementation of E1 functions provided by human 293 cells, a packaging cell line [ATCC
CRL 1573; Graham et al. (1977) J. Gen. Virol. 36:59]. Unfortunately, with the use of high
titres of E1-deleted vectors, and the fact that there are E1-like factors present in many cell
types, E1-deleted vectors can overcome the block to replication and express other viral gene
products [Imperiale et al. (1984) Mol. Cell Biol. 4:867; Nevins (1981) Cell 26:213; and
Gaynor and Berk (1983) Cell 33:683]. The expression of viral proteins in the infected target
cells elicits a swift host immune response that is largely T-cell mediated [Yang and Wilson
(1995) J. Immunol. 155:2564 and Yang et al. (1994) Proc. Natl. Acad. Sci. USA 91:4407].
The transduced cells are subsequently eliminated, along with the transferred foreign gene. In
immuno-incompetent animals, Ad-delivered genes can be expressed for periods of up to one
year [Yang et al. (1994), supra; Vincent et al. (1993) Nature Genetics 5:130; and Yang et al.
(1995) Proc. Natl. Acad. Sci. USA 92:7257].

Another shortcoming of first generation Ad vectors is that a single recombination
event between the genome of an Ad vector and the integrated E1 sequences present in 293
cells can generate replication competent Ad (RCA), which can readily contaminate viral
stocks.






In order to further cripple viral protein expression, and also to decrease the frequency
of generating RCA, the present invention provides Ad vectors containing deletions in the E2b
region. Propagation of these E2b-deleted Ad vectors requires cell lines which express the
deleted E2b gene products. The present invention provides such packaging cell lines and for
the first time demonstrates that the E2b gene products, DNA polymerase and preterminal
protein, can be constitutively expressed in 293 cells along with the E1 gene products. With
every gene that can be constitutively expressed in 293 cells comes the opportunity to generate
new versions of Ad vectors deleted for the respective genes. This has immediate benefits:
increased carrying capacity, since the combined coding sequences of the polymerase and
preterminal proteins that can be theoretically deleted approach 4.6 kb, and a decreased
incidence of RCA generation, since two or more independent recombination events would be
required to generate RCA. Therefore, the novel E1, Ad polymerase and preterminal protein
expressing cell lines of the present invention enable the propagation of Ad vectors with a
carrying capacity approaching 13 kb, without the need for a contaminating helper virus
[Mitani et al. (1995) Proc. Natl. Acad. Sci. USA 92:3854]. In addition, when genes critical
to the viral life cycle are deleted (e.g., the E2b genes), the ability of Ad to replicate
and express other viral proteins is further crippled. This decreases immune recognition of virally
infected cells, and allows for extended durations of foreign gene expression. The most
important attribute of E1, polymerase, and preterminal protein deleted vectors, however, is
their inability to express the respective proteins, as well as a predicted lack of expression of
most of the viral structural proteins. For example, the major late promoter (MLP) of Ad is
responsible for transcription of the late structural proteins L1 through L5 [Doerfler, In
Adenovirus DNA, The Viral Genome and Its Expression (Martinus Nijhoff Publishing, Boston,
1986)]. Though the MLP is minimally active prior to Ad genome replication, the rest of the
late genes get transcribed and translated from the MLP only after viral genome replication has
occurred [Thomas and Mathews (1980) Cell 22:523]. This cis-dependent activation of late
gene transcription is a feature of DNA viruses in general, such as in the growth of polyoma
and SV-40. The polymerase and preterminal proteins are absolutely required for Ad
replication (unlike the E4 or protein IX proteins) and thus their deletion is extremely
detrimental to Ad vector late gene expression.

II. Packaging Cell Lines Constitutively Expressing E2b Gene Products

The present invention addresses the limitations of current or first generation Ad
vectors by isolating novel 293 cell lines coexpressing critical viral gene functions. The
present invention describes the isolation and characterization of 293 cell lines capable of
constitutively expressing the Ad polymerase protein. In addition, the present invention
describes the isolation of 293 cells which not only express the E1 and polymerase proteins,
but also the Ad-preterminal protein. The isolation of cell lines coexpressing the E1, Ad
polymerase and preterminal proteins demonstrates that three genes critical to the life cycle of
Ad can be constitutively coexpressed, without toxicity.

In order to delete critical genes from self-propagating Ad vectors, the proteins encoded
by the targeted genes have to first be coexpressed in 293 cells along with the E1 proteins.
Therefore, only those proteins which are non-toxic when coexpressed constitutively (or toxic
proteins inducibly-expressed) can be utilized. Coexpression in 293 cells of the E1 and E4
genes has been demonstrated (utilizing inducible, not constitutive, promoters) [Yeh et al.
(1996) J. Virol. 70:559; Wang et al. (1995) Gene Therapy 2:775; and Gorziglia et al. (1996)
J. Virol. 70:4173]. The E1 and protein IX genes (a virion structural protein) have been
coexpressed [Caravokyri and Leppard (1995) J. Virol. 69:6627], and coexpression of the E1,
E4, and protein IX genes has also been described [Krougliak and Graham (1995) Hum. Gene
Ther. 6:1575].

The present invention provides for the first time, cell lines coexpressing E1 and E2b
gene products. The E2b region encodes the viral replication proteins which are absolutely
required for Ad genome replication [Doerfler, supra and Pronk et al. (1992) Chromosoma
102:S39-S45]. The present invention provides 293 cells which constitutively express the 140
kD Ad-polymerase. While other researchers have reported the isolation of 293 cells which
express the Ad-preterminal protein, utilizing an inducible promoter [Schaack et al. (1995) J.
Virol. 69:4079], the present invention is the first to demonstrate the high-level, constitutive
coexpression of the E1, polymerase, and preterminal proteins in 293 cells, without toxicity.
These novel cell lines permit the propagation of novel Ad vectors deleted for the E1,
polymerase, and preterminal proteins.

III. Encapsidated Adenoviral Minichromosomes

The present invention also provides encapsidated adenovirus minichromosomes (EAMs)
consisting of an infectious encapsidated linear genome containing Ad origins of replication,
packaging signal elements, a reporter gene (e.g., a β-galactosidase reporter gene cassette) and
a gene of interest (e.g., a full length (14 kb) dystrophin cDNA regulated by a muscle specific
enhancer/promoter). EAMs are generated by cotransfecting 293 cells with supercoiled
plasmid DNA (e.g., pAd5βdys) containing an embedded inverted origin of replication (and the
remaining above elements) together with linear DNA from E1-deleted virions expressing
human placental alkaline phosphatase (hpAP) (a helper virus). All proteins necessary for the
generation of EAMs are provided in trans from the hpAP virions and the two can be
separated from each other on equilibrium CsCl gradients. These EAMs are useful for gene
transfer to a variety of cell types both in vitro and in vivo.

Adenovirus-mediated gene transfer to muscle is a promising technology for gene
therapy of Duchenne muscular dystrophy (DMD). However, currently available recombinant
adenovirus vectors have several limitations, including a limited cloning capacity of ~8.5 kb,
and the induction of a host immune response that leads to transient gene expression of 3 to 4
weeks in immunocompetent animals. Gene therapy for DMD could benefit from the
development of adenoviral vectors with an increased cloning capacity to accommodate a full
length (~14 kb) dystrophin cDNA. This increased capacity should also accommodate gene
regulatory elements to achieve expression of transduced genes in a tissue-specific manner.
Additional vector modifications that eliminate adenoviral genes, expression of which is
associated with development of a host immune response, might greatly increase long term
expression of virally delivered genes in vivo. The constructed encapsidated adenovirus
minichromosomes of the present invention are capable of delivering up to 35 kb of non-viral
exogenous DNA. These minichromosomes are derived from bacterial plasmids containing
two fused inverted adenovirus origins of replication embedded in a circular genome, the
adenovirus packaging signals, a β-galactosidase reporter gene and a full length dystrophin
cDNA regulated by a muscle specific enhancer/promoter. The encapsidated minichromosomes
are propagated in vitro by trans-complementation with a replication defective (E1+E3 deleted)
helper virus. These minichromosomes can be propagated to high titer (>10^8/ml) and purified
on CsCl gradients due to their buoyancy difference relative to helper virus. These vectors are
able to transduce myogenic cell cultures and express dystrophin in myotubes. These results
demonstrate that encapsidated adenovirus minichromosomes are useful for gene transfer to
muscle and other tissues.
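
The capacity figures given in this description can be compared against an insert size with a short sketch. The capacities are taken from the text (~8.5 kb for currently available vectors, approaching 13 kb for E1/E2b-deleted vectors, up to 35 kb for EAMs). The simple size comparison is an assumption made for illustration; it ignores regulatory elements, reporter cassettes and any packaging size constraints, and the names used are hypothetical.

```python
# Approximate carrying capacities in kb, as stated in this description.
CAPACITY_KB = {
    "first-generation (E1-deleted) vector": 8.5,
    "E1/E2b-deleted vector": 13.0,
    "encapsidated adenovirus minichromosome (EAM)": 35.0,
}


def vectors_that_fit(insert_kb: float) -> list:
    """Return the vector types whose stated capacity can hold the insert."""
    return [name for name, cap in CAPACITY_KB.items() if insert_kb <= cap]


if __name__ == "__main__":
    dystrophin_cdna_kb = 14.0  # full length dystrophin cDNA (~14 kb)
    for name in vectors_that_fit(dystrophin_cdna_kb):
        print(name)
```

On these figures, a full length dystrophin cDNA exceeds the stated capacity of both self-propagating vector types and is accommodated only by the EAM.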

The present invention further provides methods for modifying the above-described
EAM system to enable the generation of high titer stocks of EAMs with minimal helper virus
contamination. Preferably the EAM stocks contain helper virus representing less than 1%,
preferably less than 0.1% and most preferably less than 0.01% (including 0.0%) of the final
viral isolate.

The amount of helper virus present in the EAM preparations is reduced in two ways.
The first is by selectively controlling the relative packaging efficiency of the helper virus
versus the EAM virus. The Cre-loxP excision method is employed to remove the packaging
signals from the helper virus, thereby preventing the packaging of the helper virus used to
provide in trans viral proteins for the encapsidation of the recombinant adenovirus
minichromosomes. The second approach to reducing or eliminating helper virus in EAM
stocks is the use of improved physical methods for separating EAM from helper virus.



EXPERIMENTAL 

The following examples serve to illustrate certain preferred embodiments and aspects 
of the present invention and are not to be construed as limiting the scope thereof. 

In the experimental disclosure which follows, the following abbreviations apply: M
(molar); mM (millimolar); µM (micromolar); mol (moles); mmol (millimoles); µmol
(micromoles); nmol (nanomoles); mu or m.u. (map unit); g (gravity); gm (grams); mg
(milligrams); µg (micrograms); pg (picograms); L (liters); ml (milliliters); µl (microliters); cm
(centimeters); mm (millimeters); µm (micrometers); nm (nanometers); hr (hour); min
(minute); msec (millisecond); °C (degrees Centigrade); AMP (adenosine 5'-monophosphate);
cDNA (copy or complementary DNA); DTT (dithiothreitol); ddH2O (double distilled water);
dNTP (deoxyribonucleotide triphosphate); rNTP (ribonucleotide triphosphate); ddNTP
(dideoxyribonucleotide triphosphate); bp (base pair); kb (kilo base pair); TLC (thin layer
chromatography); tRNA (transfer RNA); nt (nucleotide); VRC (vanadyl ribonucleoside
complex); RNase (ribonuclease); DNase (deoxyribonuclease); poly A (polyriboadenylic acid);
PBS (phosphate buffered saline); OD (optical density); HEPES
(N-[2-Hydroxyethyl]piperazine-N'-[2-ethanesulfonic acid]); HBS (HEPES buffered saline); SDS
(sodium dodecyl sulfate); Tris-HCl (tris[Hydroxymethyl]aminomethane-hydrochloride); rpm
(revolutions per minute); ligation buffer (50 mM Tris-HCl, 10 mM MgCl2, 10 mM
dithiothreitol, 25 µg/ml bovine serum albumin, and 26 µM NAD+, and pH 7.8); EGTA
(ethylene glycol-bis(β-aminoethyl ether) N,N,N',N'-tetraacetic acid); EDTA
(ethylenediaminetetraacetic acid); ELISA (enzyme linked immunosorbent assay); LB (Luria-
Bertani broth: 10 g tryptone, 5 g yeast extract, and 10 g NaCl per liter, pH adjusted to 7.5
with 1N NaOH); superbroth (12 g tryptone, 24 g yeast extract, 5 g glycerol, 3.8 g KH2PO4
and 12.5 g K2HPO4 per liter); DMEM (Dulbecco's modified Eagle's medium); ABI (Applied
Biosystems Inc., Foster City, CA); Amersham (Amersham Corporation, Arlington Heights,
IL); ATCC (American Type Culture Collection, Rockville, MD); Beckman (Beckman
Instruments Inc., Fullerton, CA); BM (Boehringer Mannheim Biochemicals, Indianapolis, IN);
Bio-101 (Bio-101, Vista, CA); BioRad (BioRad, Richmond, CA); Brinkmann (Brinkmann
Instruments Inc., Westbury, NY); BRL, Gibco BRL and Life Technologies (Bethesda Research
Laboratories, Life Technologies Inc., Gaithersburg, MD); CRI (Collaborative Research Inc.,
Bedford, MA); Eastman Kodak (Eastman Kodak Co., Rochester, NY); Eppendorf (Eppendorf,
Eppendorf North America, Inc., Madison, WI); Falcon (Becton Dickinson Labware, Lincoln
Park, NJ); IBI (International Biotechnologies, Inc., New Haven, CT); ICN (ICN Biomedicals,
Inc., Costa Mesa, CA); Invitrogen (Invitrogen, San Diego, CA); New Brunswick (New
Brunswick Scientific Co. Inc., Edison, NJ); NEB (New England BioLabs Inc., Beverly, MA);
NEN (Du Pont NEN Products, Boston, MA); Pharmacia (Pharmacia LKB, Gaithersburg, MD);
Promega (Promega Corporation, Madison, WI); Stratagene (Stratagene Cloning Systems, La
Jolla, CA); UVP (UVP, Inc., San Gabriel, CA); USB (United States Biochemical Corp.,
Cleveland, OH); and Whatman (Whatman Lab. Products Inc., Clifton, NJ).

Unless otherwise indicated, all restriction enzymes and DNA modifying enzymes were
obtained from New England Biolabs (NEB) and used according to the manufacturer's
directions.



EXAMPLE 1 

Generation Of Packaging Cell Lines That Coexpress
The Adenovirus E1 And DNA Polymerase Proteins

In this example, packaging cell lines coexpressing Ad E1 and polymerase proteins
are described. These cell lines were shown to support the replication and growth of H5ts36,
an Ad with a temperature-sensitive mutation of the Ad polymerase protein. These
lines can be used to prepare Ad vectors deleted for the



a) Tissue Culture And Virus Growth 

LP-293 cells (Microbix Biosystems, Toronto) were grown and serially passaged as
suggested by the supplier.

Plaque assays were performed in 60 mm dishes containing cell monolayers at ~90%
confluency. The appropriate virus dilution in a 2% DMEM solution was dripped onto the
cells, and the plates incubated at the appropriate temperature for one hour. The virus
containing media was aspirated, the monolayer was overlaid with 10 mls of a pre-warmed
EMEM agar overlay solution (0.8% Noble agar, 4% fetal calf serum, and antibiotics) and
allowed to solidify. After the appropriate incubation time (usually 7 days for incubations at
38.5°C and 10-12 days for incubations at 32°C), five mls of the agar-containing solution
containing 1.3% neutral red was overlaid onto the infected dishes and plaques were counted
the next day. An aliquot of the virus H5ts36 [Freimuth and Ginsberg (1986) Proc. Natl.
Acad. Sci. USA 83:7816] was utilized to produce high titre stocks after infection of 293 cells
at 32°C. H5ts36 is an Ad5-derived virus defective for viral replication at the nonpermissive
temperature [Miller and Williams (1987) J. Virol. 61:3630].

The infected cells were harvested after the onset of extensive cytopathic effect,
pelleted by centrifugation and resuspended in 10 mM Tris-Cl, pH 8.0. The lysate was freeze-
thawed three times and centrifuged to remove the cell debris. The cleared lysate was applied
to CsCl step gradients (heavy CsCl at density of 1.45 g/ml, the light CsCl at density of 1.20
g/ml), ultracentrifuged, and purified using standard techniques [Graham and Prevec (1991) In
Methods in Molecular Biology, Vol 7, Gene Transfer and Expression Protocols, Murray
(ed.), Humana Press, Clifton, NJ, pp. 109-128]. The concentration of plaque forming units
(pfu) of this stock was determined at 32°C as described above. Virion DNA was extracted
from the high titre stock by pronase digestion, phenol-chloroform extraction, and ethanol
precipitation. The leakiness of this stock was found to be <1 in 2000 pfu at the non-
permissive temperature, consistent with previous reports [Miller and Williams (1987), supra].
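
The titre and leakiness determinations described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the text does not give the inoculum volume or plaque counts, so the numbers below are hypothetical, and the function names are invented for the example. The titre is taken as plaques divided by the product of dilution and inoculum volume, and leakiness as the ratio of the titre at the non-permissive temperature to that at the permissive temperature.

```python
def titre_pfu_per_ml(plaque_count: int, dilution: float, inoculum_ml: float) -> float:
    """Plaque forming units per ml of the undiluted stock."""
    if dilution <= 0 or inoculum_ml <= 0:
        raise ValueError("dilution and inoculum volume must be positive")
    return plaque_count / (dilution * inoculum_ml)


def leakiness(nonpermissive_titre: float, permissive_titre: float) -> float:
    """Fraction of pfu that still form plaques at the non-permissive temperature."""
    return nonpermissive_titre / permissive_titre


if __name__ == "__main__":
    # Hypothetical counts: 50 plaques from 0.5 ml of a 1e-6 dilution at 32 C,
    # and 20 plaques from 0.5 ml of a 1e-3 dilution at 38.5 C.
    permissive = titre_pfu_per_ml(50, 1e-6, 0.5)
    nonpermissive = titre_pfu_per_ml(20, 1e-3, 0.5)
    ratio = leakiness(nonpermissive, permissive)
    print(f"titre at 32 C: {permissive:.2e} pfu/ml")
    print(f"leakiness below 1 in 2000: {ratio < 1 / 2000}")
```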

b) Plasmids 

The expression plasmid pRSV-pol [Zhao and Padmanabhan (1988) Cell 55:1005] 
contains sequences encoding the Ad2 polymerase mRNA (including the start codon from the 
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exon at map unit 39) under the transcriptional control ot' the Rous Sarcoma Virus 
LTR/promoter element: the Ad2 DNA polymerase sequences are Hanked on the 3" end hv the 
SV-40 small t intron and SV-40 polvadenvlation addition site (see Fig. 1A). Figure 1A 
provides a schematic representation of the Ad polymerase expression plasmid pRSV-pol. 
pRSV-pol includes the initiator methionine and amino-terminal peptides encoded by the exon 
at m.u. 39 ot" the Ad genome. The location of the PCR primers p602a and p2158c. the two 
.Seal 1 kb probes utilized for Northern analyses, the polymerase terminator codon, and the 
polvadenvlation site of the IVa2 gene (at 1 1.2 m.u. of the Ad genome) are indicated. 

The expression plasmid pRSV-pTP [Zhao and Padmanabhan (1988). supra) contains 
sequences encoding the Ad2 preterminal protein (including the amino terminal peptides 
encoded by the exon at map unit 39 of the Ad genome) under the transcriptional control of 
the Rous Sarcoma Virus LTR/promoter element: the pTP sequences are flanked on the 3" end 
by the SV-40 small-t intron and SV-40 polvadenvlation signals (see Figure IB for a schematic 
of pRSV-PTP). Figure IB also shows the location of the EcoRV subfragment utilized as a 
probe in the genomic DNA and cellular RNA evaluations, as well as the initiator methionine 
codon present from map unit 39 in the Ad5 genome. The locations of the H5wl90 and 
H5.S7/M00 insertions are shown relative to the preterminal protein open-reading frame. The 
following abbreviations are used in Figure 1: ORT. open reading frame: small-t. small tumor 
antigen, and m.u.. map unit. 

pCEP4 is a plasmid containing a hvgromycin expression cassette (Invitrogen). 

pFG140 (Microbix Biosystems Inc., Toronto, Ontario) is a plasmid containing
sequences derived from Ad5dl309 which contain a deletion/substitution in the E3 region
[Jones and Shenk (1979) Cell 17:683]. pFG140 is infectious in single transfections of 293
cells and is used as a control for transfection efficiency.



c) Transfection Of 293 Cells 

LP-293 cells were cotransfected with BamHI-linearized pRSV-pol and BamHI-linearized
pCEP4 at a molar ratio of 10:1 using a standard CaPO4 precipitation method
[Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Press, Plainview, NY, pp. 16.33-16.36]. In addition, 293 cells were cotransfected with
BamHI-linearized pRSV-pol, BamHI-linearized pRSV-pTP and pCEP4 using a molar ratio of
10:1 (non-selectable:selectable plasmids).







WO 98/17783 PCT/US97/19541



Forty-eight hours after transfection, the cells were passaged into media containing
hygromycin at 100 µg/mL. Individual hygromycin-resistant colonies were isolated and
expanded.



d) Isolation Of Ad Polymerase Expressing 293 Cell Lines 
Twenty hygromycin-resistant cell lines were expanded and screened for the ability to
express the Ad polymerase protein. Initially, the individual cell lines were assayed for the
ability to support growth of the viral polymerase mutant H5ts36 using the plaque assay
described in section a) above. It was speculated that constitutive expression of the wild type
Ad polymerase protein in a clonal population of 293 cells should allow the growth of H5ts36
at 38.5°C. However, it was unclear if constitutive expression of the Ad polymerase would be
toxic when coexpressed with the E1 proteins present in 293 cells. Similar toxicity problems
have been observed with the Ad ssDBP and the pTP [Klessig et al. (1984) Mol. Cell. Biol.
4:1354 and Schaack et al. (1995) J. Virol. 69:4079].

Of the twenty hygromycin-resistant cell lines isolated, seven were able to support
plaque formation with H5ts36 at the non-permissive temperature, unlike the parental LP-293
cells; these cell lines were named B-6, B-9, C-1, C-4, C-7, C-13 and C-14 (see Table 1) (the
B-6 and B-9 cell lines received only the pRSV-pol and pCEP4 plasmids; the C-1, C-4, C-7, C-13
and C-14 cell lines received the pRSV-pol, pRSV-pTP and pCEP4 plasmids). For the results
shown in Table 1, dishes (60 mm) of near confluent cells of each cell line were infected with
the same dilution of H5ts36 at the temperature indicated, overlaid with agar media, and
stained for plaques as outlined in section a). Passage number refers to the number of serial
passages after initial transfection with the plasmid pRSV-pol.

The cell line B-6 produced plaques one day earlier than the other Ad polymerase-expressing
cell lines, which may reflect increased polymerase expression (see below). Cell
line B-9 demonstrated an increased doubling time whereas each of the other cell lines
displayed no growth disadvantages relative to the parental LP-293 cells. As shown in Table
1, even after multiple passages (in some instances up to four months of serial passaging) the
cells were still capable of H5ts36 plaque formation at the non-permissive temperature,
indicating that the RSV-LTR/promoter remained active for extended periods of time.
However, the cell lines B-9 and C-13 displayed a decreased ability to plaque the virus at 32°C
as well as at 38.5°C, suggesting that a global viral complementation defect had occurred in
these cell lines after extended passaging. The remaining cell lines screened at later passages
demonstrated no such defect, even after 20 passages (e.g., cell line B-6, Table 1).

TABLE 1

Plaquing Ability Of H5ts36 At The Non-Permissive Temperature Utilizing 293-Ad Polymerase
Expressing Cell Lines

                               Number Of Plaques At:
Cell Line    Passage Number    32.0°C     38.5°C
293          -                 >500       0
B-6          9                 >500       >500
B-6          20                >500       >500
B-9          5                 >500       >500
B-9          14                90         18
C-1          5                 >500       500
C-1          13                >500       >500
C-4          5                 >500       >500
C-4          14                >500       >500
C-7          5                 >500       >500
C-7          14                >500       >500
C-13         5                 >500       >500
C-13         27                120        4
C-14         5                 >500       >500
C-14         27                >500       370



e) Genomic Analysis Of Ad Polymerase-Expressing Cell Lines
Genomic DNA from LP-293 cells and each of the seven cell lines able to complement
H5ts36 at 38.5°C was analyzed by PCR for the presence of pRSV-pol derived sequences.
Genomic DNA from LP-293 cells and the hygromycin-resistant cell lines was harvested
using standard protocols (Sambrook et al., supra) and 200 ng of DNA from each cell line was
analyzed by PCR in a solution containing 2 ng/mL of primers p602a and p2158c, 10 mM
Tris-HCl, pH 8.3, 50 mM KCl, 1.5 mM MgCl2, and 0.001% gelatin. The forward primer,
p602a [5'-TTCATTTTATGTTTCAGGTTCAGGG-3' (SEQ ID NO:2)], is located in the SV-40
polyadenylation sequence. The reverse primer, p2158c [5'-TTACCGCCACACTCGCAGGG-3'
(SEQ ID NO:3)], is Ad-sequence specific with the 5' nucleotide located at position 3394 of
the Ad5 genome [numbering according to Doerfler (1986) Adenovirus DNA: The Viral Genome
and Its Expression, Nijhoff, Boston, MA, pp 95].
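As a quick sanity check on the two primers quoted above, the following minimal sketch computes their length and GC fraction. The sequences are copied from the text (SEQ ID NO:2 and SEQ ID NO:3); the helper names `gc_fraction` and `reverse_complement` are illustrative only and are not part of the described method.

```python
# Primer sequences as quoted in the text (SEQ ID NO:2 and SEQ ID NO:3).
PRIMERS = {
    "p602a": "TTCATTTTATGTTTCAGGTTCAGGG",   # forward, SV-40 polyadenylation sequence
    "p2158c": "TTACCGCCACACTCGCAGGG",       # reverse, Ad-sequence specific
}

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), "nt", f"GC={gc_fraction(seq):.2f}")
```

The markedly different GC content of the two primers is one plausible reason a protocol might start with a few low-stringency annealing cycles before raising the annealing temperature; that interpretation is an assumption, not a statement from the text.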



PCR was performed with a Perkin Elmer 9600 Thermocycler utilizing the following
cycling parameters: initial denaturation at 94°C for 3 min, 3 cycles of denaturation at 94°C
for 30 sec, annealing at 50°C for 30 sec, and extension at 72°C for 60 sec, followed by
another 27 cycles with an increased annealing temperature of 56°C, with a final extension at
72°C for 10 minutes. PCR products were separated on a 1.0% agarose gel and visualized
with ethidium bromide staining (Figure 2). A 1 kb ladder (Gibco-BRL) was used as a size
marker, and the plasmid pRSV-pol was used as a positive control. The ~750 bp PCR products
are indicated by an arrow in Figure 2.
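The cycling parameters above can be written down as a small data structure, which makes the total number of amplification cycles and the minimum programmed hold time easy to check. This is a minimal sketch: the `Step` and `Stage` classes are hypothetical, and it assumes the 27 later cycles keep the same denaturation and extension settings as the first 3 (the text only states that the annealing temperature is raised to 56°C). Ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees Celsius
    seconds: int    # programmed hold time


@dataclass(frozen=True)
class Stage:
    steps: tuple    # ordered Step objects run once per cycle
    cycles: int = 1


# Program as described in the text; stage 3 timings are an assumption (see lead-in).
PROGRAM = (
    Stage((Step(94, 180),)),                                       # initial denaturation
    Stage((Step(94, 30), Step(50, 30), Step(72, 60)), cycles=3),   # low-stringency cycles
    Stage((Step(94, 30), Step(56, 30), Step(72, 60)), cycles=27),  # higher annealing temp
    Stage((Step(72, 600),)),                                       # final extension
)


def amplification_cycles(program) -> int:
    """Count cycles in multi-step (denature/anneal/extend) stages."""
    return sum(stage.cycles for stage in program if len(stage.steps) > 1)


def hold_seconds(program) -> int:
    """Sum of all programmed hold times, excluding ramping."""
    return sum(stage.cycles * sum(s.seconds for s in stage.steps) for stage in program)


if __name__ == "__main__":
    print(amplification_cycles(PROGRAM), "cycles,", hold_seconds(PROGRAM) / 60, "min of holds")
```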



As shown in Figure 2, all cell lines capable of H5ts36 plaque formation at 38.5°C
contained the Ad pol DNA sequences, whereas the LP-293 cells did not yield any
amplification product with these primers. This result demonstrates that each of the selected
cell lines stably co-integrated not only the hygromycin resistance plasmid pCEP4, but also
pRSV-pol.



f) Complementation Of The Replication Defect Of H5ts36 By
Ad Polymerase Expressing Cell Lines

The C to T transition at position 7623 of the H5ts36 genome alters the DNA binding
affinity of the Ad polymerase protein, rendering it defective for viral replication at
non-permissive temperatures [Chen et al. (1994) Virology 205:364; Miller and Williams (1987) J.
Virol. 61:3630; and Wilkie et al. (1973) Virology 51:499]. To analyze the functional activity
of the Ad polymerase protein expressed by each of the packaging cell lines, a viral
replication-complementation assay was performed. LP-293 cells or the hygromycin-resistant
cell lines were seeded onto 60 mm dishes at densities of 2.5-3.0 x 10^6 per dish, infected with
H5ts36 at a multiplicity of infection (MOI) of 10, and incubated for 24 hours at 38.5°C, or 48
hours at 32°C. Total DNA was harvested from each plate, then 2 µg of each sample were
digested with HindIII, separated on a 1.0% agarose gel, transferred to a nylon membrane, and
hybridized with 32P-labeled H5ts36 virion DNA. Densitometric analysis of the 8,010 bp
HindIII fragment in each lane was performed on a phosphorimager (Molecular Dynamics)
utilizing a gel image processing system (IP Lab Version 1.5, Sunnyvale, CA).

The resulting autoradiograph is shown in Figure 3. In Figure 3, the lane marked
"Std." (standard) contains 1 µg of HindIII-digested H5ts36 virion DNA. The 8,010 bp
HindIII fragments analyzed by densitometry are indicated by an arrow.

As shown in Figure 3, H5ts36 had a diminished ability to replicate in LP-293 cells at
the non-permissive temperature. In contrast, all seven of the previously selected cell lines
were able to support replication of H5ts36 virion DNA at 38.5°C to levels approaching those
occurring in LP-293 cells at 32°C.

A densitometric analysis of the amount of H5ts36 viral DNA replicated in each of the
cell lines at permissive and nonpermissive temperatures is presented in Table 2 below. For
this assay, the relative amounts of the 8,010 bp HindIII fragment were compared. The
relative levels of H5ts36 virion DNA replication were determined by densitometric analysis
of the 8,010 bp HindIII fragment isolated from each of the cell line DNA samples. The surface
area of the 8,010 bp fragment in 293 cells incubated at 38.5°C was designated as 1 and includes
some replicated H5ts36 virion DNA. The numbers in each column represent the ratio
between the density of the 8,010 bp fragment isolated in the indicated cell line and the
density of the same band present in LP-293 cells at 38.5°C.
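The normalization described above (each band density divided by the density of the same band in LP-293 cells at 38.5°C) can be sketched as follows. The function name and the raw density values are hypothetical and chosen only for illustration; the text reports the resulting ratios, not the underlying densitometer readings.

```python
def relative_replication(densities: dict, reference: tuple) -> dict:
    """Express each band density as a ratio to the reference band.

    `densities` maps (cell_line, temperature_c) to a raw band density;
    `reference` is the key whose value is defined as 1.0.
    """
    ref_value = densities[reference]
    if ref_value <= 0:
        raise ValueError("reference band density must be positive")
    return {key: round(value / ref_value, 1) for key, value in densities.items()}


if __name__ == "__main__":
    # Hypothetical raw densities (arbitrary units), not taken from the source.
    raw = {
        ("LP-293", 38.5): 1200.0,
        ("C-7", 38.5): 90000.0,
    }
    print(relative_replication(raw, reference=("LP-293", 38.5)))
```

Because the reference lane itself contains input virion DNA plus some leaky replication, ratios computed this way understate the true complementation, which is consistent with the authors calling their estimates conservative.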

As shown in Table 2, the levels of replication at the permissive temperature were all
within approximately four-fold of each other, regardless of which cell line was analyzed, but
at the non-permissive temperature LP-293 cells reveal the H5ts36 replication defect. The viral
bands that were present in the LP-293 DNA sample at 38.5°C represented input virion DNA as
well as low level replication of H5ts36 DNA, which is generated due to the leakiness of the ts
mutation at the high MOI utilized in this experiment. The Ad polymerase-expressing cell
lines were all found to be capable of augmenting H5ts36 genome replication. Although one
cell line (B-9) allowed H5ts36 replication to levels only 16-fold greater than LP-293 cells,
this was the same cell line that was observed to display poor growth properties. Each of the
remaining Ad polymerase-expressing cell lines allowed substantially greater replication of
H5ts36 at non-permissive temperatures, compared to LP-293 cells (Table 2). An enhancement
of replication up to 75-fold above that of LP-293 cells was observed with the cell line C-7 at
38.5°C. A substantially more rapid onset of viral cytopathic effect in Ad
polymerase-expressing cell lines was observed at either temperature. These estimates of
H5ts36 replication-complementation are conservative, since they have not been adjusted for
the low level replication of H5ts36 at 38.5°C [Miller and Williams (1987), supra]. The
leakiness of the H5ts36 mutation could potentially be overcome with the use of a virus deleted
for the Ad polymerase gene.

TABLE 2

Densitometric Analysis Of H5ts36 Replication

             Ratios Of 8,010 bp HindIII Fragment Generated At:
Cell Line    32.0°C     38.5°C
LP-293       42.6       1.0
B-6          27.3       57.2
B-9          69.0       15.8
C-1          113.6      34.1
C-4          100.2      65.5
C-7          114.9      75.0
C-13         43.1       42.2
C-14         46.7       27.8

g) RNA Analysis Of Ad Polymerase-Expressing Cell Lines 

Total RNA was extracted from each of the cell lines using the RNAzol method
(Teltest, Inc., Friendswood, TX 77546; Chomczynski and Sacchi (1987) Anal. Biochem.
162:156). Fifteen micrograms of RNA from each cell line was electrophoresed on a 0.8%
agarose-formaldehyde gel, and transferred to a Nytran membrane (Schleicher & Schuell) by
blotting. The filter was UV crosslinked, and analyzed by probing with the two 32P-labeled 1
kb ScaI subfragments of Ad which span positions 6095-8105 of the Ad5 genome (see Figure
1A). These two ScaI subfragments of the Ad genome are complementary to the 5' end of the
Ad polymerase mRNA. The resulting autoradiograph is shown in Figure 4. In Figure 4, the
locations of the smaller and larger species of Ad polymerase mRNA are indicated relative to
the 28S and 18S ribosomal RNAs. The aberrant transcript expressed by the B-9 cell line is
indicated by an arrow.

The results shown in Figure 4 revealed that the RNA derived from cell lines B-6 and
C-7 contained two species of RNA, estimated to be ~4800 and ~7000 nt in length, while LP-293
derived RNA had no detectable hybridization signal. The presence of two polymerase RNA
species suggests that the polyadenylation signal of the Ad IVa2 gene (present in the Ad-pol
construct, see Fig. 1A) is being utilized by the cell RNA processing machinery, in addition to
the SV-40 polyadenylation signal. Similar analysis of RNA derived from the cell lines C-1,
C-4, C-13, and C-14 also detected the same two transcripts as those detected in the RNA of
cell lines B-6 and C-7, but at decreased levels, suggesting that even low levels of Ad
polymerase mRNA expression can allow for the efficient replication of polymerase mutants
such as H5ts36. The cell line B-6 expressed high levels of polymerase transcript and can
plaque H5ts36 one day earlier than the other cell lines at 38.5°C, suggesting a causal
relationship. It is interesting to note that the two polymerase transcripts are also detected in
RNA isolated from cell line B-9, but substantial amounts of a larger RNA transcript (size >
10 kb) are also present (see Figure 4). The high level production of the aberrant message may
be related to the increased doubling time previously noted in this cell line.

h) Transfectability Of Ad Polymerase-Expressing Cell Lines

The ability of Ad polymerase-expressing 293 cell lines to support production of
H5ts36 virions after transfection with H5ts36 genome DNA was examined as follows. 293
cells as well as hygromycin-resistant cell lines were grown to near confluency on 60 mm
dishes and transfected with either 3 µg of purified H5ts36 virion DNA, or with 3.5 µg of the
plasmid pFG140 (Microbix Biosystems), using the cationic lipid Lipofectamine (Gibco-BRL).
Cells that received the H5ts36 virion DNA were incubated at 32°C for 14 days, or 38.5°C for
10 days. The pFG140 transfected cells were incubated at 37.5°C for 10 days. All plates
were then stained with the neutral red agar overlay and plaques were counted the next day.

The results are shown in Table 3.

TABLE 3

Transfection Efficiency Of Ad pol-Expressing Cell Lines

             Number Of Plaques At:
Cell Line    32.0°C     38.5°C
LP-293       >500       0
B-6          >500       >500
B-9          n.d.(1)    n.d.
C-1          n.d.       >500
C-4          n.d.       >500
C-7          n.d.       >500
C-13         n.d.       100
C-14         n.d.       >500

(1) n.d. - not determined.

The results shown in Table 3 demonstrated that transfection of H5ts36 DNA at the
non-permissive temperature allows for ample plaque production in all of the Ad
polymerase-expressing cell lines tested, unlike the parental LP-293 cells. Cell line C-13 was
at passage number 29, and demonstrated a somewhat decreased ability to generate plaques at
this extended passage number. These same cell lines are also capable of producing plaques
when transfected with the plasmid pFG140, a plasmid capable of producing infectious,
E1-dependent Ad upon transfection of the parental 293 cells [Ghosh-Choudhury (1986) Gene
50:161]. These observations suggest that the Ad polymerase expressing cell lines should be
useful for the production of second generation Ad vectors deleted not only for the E1 genes,
but also for the polymerase gene. As shown below in Examples 2 and 3, this is indeed the
case.



EXAMPLE 2 

Isolation And Characterization Of Packaging Cell Lines That
Coexpress The Adenovirus E1, DNA Polymerase And Preterminal Proteins

In Example 1, packaging cell lines coexpressing Ad E1 and polymerase proteins were
described. These cell lines were shown to support the replication and growth of H5ts36, an
Ad with a temperature-sensitive mutation of the Ad polymerase protein. These
polymerase-expressing packaging cell lines can be used to prepare Ad vectors deleted for the
E1 and polymerase functions. In this example, 293 cells cotransfected with both Ad polymerase
and preterminal protein expression plasmids are characterized. Cell lines co-expressing the Ad
E1, polymerase and preterminal proteins can be used to prepare Ad vectors deleted for the E1,
polymerase and preterminal protein (pTP) functions.



a) Tissue Culture And Virus Propagation 

The use of LP-293 cells (Microbix Biosystems Inc., Toronto), Ad-polymerase
expressing cell lines, and plaquing efficiency assays of Ad viruses was conducted as described
in Example 1. All cells were maintained in 10% fetal bovine serum supplemented DMEM
media (GIBCO) in the presence of antibiotics. The virus H5sub100 [Freimuth and Ginsberg
(1986) Proc. Natl. Acad. Sci. USA 83:7816] has a temperature sensitive (ts) mutation caused
by a three base pair insertion within the amino terminus of the preterminal protein, in addition
to a deletion of the E1 sequences (see Figure 1B). H5sub100 was propagated and titred at
32.0°C in LP-293 cells; the leakiness of this stock was less than 1 per 1000 plaque-forming
units (pfu) at the nonpermissive temperature of 38.5°C. A lower titer cell lysate containing
the virus H5in190 (which contains a 12 base-pair insertion within the carboxy-terminus of the
preterminal protein as well as a deletion of the E1 region, see Figure 1B) was provided by Dr.
P. Freimuth [Freimuth and Ginsberg (1986), supra]. The polymerase and preterminal protein
expressing cell lines were always maintained in media supplemented with hygromycin
(Sigma) at 100 µg/mL.

b) Isolation Of Ad Polymerase And Preterminal Protein 
Expressing 293 Cells 

The C-1, C-4, C-7, C-13, and C-14 cell lines (Ex. 1), which had been cotransfected
with pRSV-pol, pRSV-pTP and pCEP4, were screened for presence of pTP sequences and for
the ability to support the growth of H5ts36 (ts for the Ad polymerase), H5in190, and
H5sub100 using plaque assays as described in Example 1.

i) Analysis Of Genomic DNA And Cellular RNA
Cell lines that had received the preterminal protein expression plasmid were screened for
the presence of pRSV-pTP sequences and E1 sequences. Total DNA was isolated from the
LP-293, B-6 (transfected with pRSV-pol only) or C-7 (cotransfected with pRSV-pol and
pRSV-pTP) cell lines; two micrograms (µg) of each DNA was codigested with the restriction
enzymes XhoI and BamHI, electrophoretically separated in a 0.6% agarose gel, and
transferred onto a nylon membrane. The membrane was UV crosslinked and probed with both a
1.8 kb BlnI-XbaI fragment (spans the E1 coding region) isolated from the plasmid pFG140,
and a 1.8 kb EcoRV subfragment of Ad serotype 5 (spans the preterminal protein coding
sequences; see Fig. 1B), both of which were random-primer radiolabeled with 32P to a specific
activity greater than 3.0 x 10^8 cpm/µg. The membrane was subsequently exposed to X-ray
film with enhancement by a fluorescent screen. The resulting autoradiograph is shown in
Figure 5.

In Figure 5, the preterminal specific sequences migrated as an ~11.0 kb DNA fragment
while the E1 containing band migrated as a 2.3 kb DNA fragment. No hybridization of either
probe to DNA isolated from either the LP-293 or B-6 cell lines was observed. As shown in
Figure 5, only the C-7 cell line genomic DNA had preterminal coding sequences, unlike the
parental LP-293 cells, or the Ad-polymerase expressing B-6 cells. In addition, all cell lines
had E1 specific sequences present at nearly equivalent amounts, demonstrating that the
selection design has not caused the loss of the E1 sequences originally present in the LP-293
cells. The results presented in Example 1 demonstrated that both the B-6 and C-7 cell lines
contain polymerase specific sequences within their genomes, unlike the parental LP-293 cells.

To confirm that transcription of preterminal protein was occurring, total RNA was
isolated from each of the cell lines, transferred to nylon membranes, and probed to detect
preterminal protein-specific mRNA transcripts as follows. Total cellular RNA was isolated
from the respective cell lines and 15 µg of total RNA from each cell line was transferred to
nylon membranes. The membranes were probed with the 1.8 kb EcoRV radiolabeled
subfragment of Ad5 (see Fig. 1B) complementary to the preterminal protein coding region.
The resulting autoradiograph is shown in Figure 6.

As shown in Figure 6, a single mRNA of the expected size (~3 kb in length) is detected
only in RNA derived from the C-7 cell line. No hybridization was detected in lanes
containing RNA derived from the LP-293 or B-6 cell lines. In Example 1, it was
demonstrated that the C-7 cell line also expresses high levels of the Ad polymerase mRNA.
Thus, the C-7 cell line constitutively expresses both the Ad polymerase and preterminal
protein mRNAs along with E1 transcripts.



ii) Plaquing Efficiency Of pTP Mutants On pTP-Expressing Cell Lines

The C-7 cell line was screened for the ability to transcomplement the growth of
preterminal mutant viruses. The virus H5in190 (contains a 12 base pair insertion located
within the carboxy-terminus of the preterminal protein) has been shown to have a severe
growth and replication defect, producing less than 10 plaque-forming units per cell [Freimuth
and Ginsberg (1986) Proc. Natl. Acad. Sci. USA 83:7816]. The results are summarized in
Table 4 below. For the results shown in Table 4, LP-293, B-6, or C-7 cells were seeded at a
density of 2.0-2.5 x 10^6 cells per plate. The cells were infected with limiting dilutions of
lysates derived from the preterminal protein-mutant viruses H5in190 or H5sub100, incubated
at 38.5°C, and plaques counted after six days. As shown in Table 4, only the C-7 cell line
could allow efficient plaque formation of H5in190 at 38.5°C (the H5in190 lysate used to
infect the cells was of a low titer, relative to the high titer H5sub100 stock), while both the
B-6 and C-7 cell lines had nearly equivalent plaquing efficiencies when H5sub100 was utilized
as the infecting virus.



TABLE 4

Plaquing Efficiency Of Preterminal Protein-Mutant Viruses

                                   Plaque Titres (pfu/ml)
Virus       Mutation Location      LP-293        B-6           C-7
H5in190     carboxy-terminus       <1 x 10^?     <1 x 10^?     1.4 x 10^?
H5sub100    amino-terminus         <1 x 10^?     9.0 x 10^?    4.5 x 10^?

As shown in Table 4, when equivalent dilutions of H5in190 were utilized, the plaquing
efficiency of the C-7 cell line was at least 100-fold greater than that of the B-6 or LP-293
cells. This result demonstrated that the C-7 cell line produces a functional preterminal
protein, capable of trans-complementing the defect of the H5in190 derived preterminal
protein.

The cell lines were next screened for the ability to trans-complement the
temperature-sensitive virus, H5sub100, at nonpermissive temperatures. H5sub100 has a codon
insertion mutation within the amino-terminus of the preterminal protein, as well as an E1
deletion. The mutation is responsible both for a temperature sensitive growth defect and for
a replication defect [Freimuth and Ginsberg (1986), supra and Schaack et al. (1995) J.
Virol. 69:4079]. The plaquing efficiency of the cell line C-7 was found to be at least
1000-fold greater than that of the LP-293 cells (at non-permissive temperatures) (see Table 4).
Interestingly, the cell line B-6 was also capable of producing large numbers of H5sub100
derived plaques at 38.5°C, even though it does not express any preterminal protein. This
result suggested that the high level expression of the polymerase protein was allowing plaque
formation of H5sub100 in the B-6 cell line. To examine this possibility, the nature of
H5sub100 growth in the various cell lines was examined.

c) Complementation Of The Replication And Growth Defects
Of H5sub100

The cell lines B-6 and C-7 were shown to overcome the replication defect of H5ts36
(Ex. 1). Since the preterminal and polymerase proteins are known to physically interact with
each other [Zhao and Padmanabhan (1988) Cell 55:1005], we investigated whether the
expression of the Ad polymerase could overcome the replication defect of H5sub100 at
non-permissive temperatures using the following replication-complementation assay.

LP-293, B-6, or C-7 cells were seeded onto 60 mm dishes at a density of 2 x 10^6 cells
per dish, infected the next day with H5sub100 at a multiplicity of infection (MOI) of 0.25,
and incubated at 38.5°C for 16 hours or 32.0°C for 40 hours. The cells from each infected
plate were then harvested and total DNA extracted as described in Example 1. Four
micrograms of each DNA sample was digested with HindIII, electrophoresed through a 0.7%
agarose gel, transferred to a nylon membrane, and probed with 32P-labeled H5ts36 virion
DNA. The resulting autoradiograph is shown in Figure 7. Figure 7 shows the H5sub100
replication defect when the virus is grown in LP-293 cells at 38.5°C; this defect is not
present when the virus is grown at the same temperature in either B-6 or C-7 cells.
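The infection conditions above (2 x 10^6 cells per dish at an MOI of 0.25) translate directly into the number of plaque-forming units added per dish. The sketch below shows that arithmetic; the stock titer used in the usage example and the Poisson estimate of the infected fraction are illustrative assumptions that do not appear in the text.

```python
import math


def pfu_required(cells: float, moi: float) -> float:
    """Plaque-forming units needed to infect `cells` at the given MOI."""
    return cells * moi


def inoculum_volume_ml(cells: float, moi: float, stock_titer_pfu_per_ml: float) -> float:
    """Volume of virus stock (mL) that delivers the required pfu."""
    return pfu_required(cells, moi) / stock_titer_pfu_per_ml


def fraction_infected(moi: float) -> float:
    """Expected fraction of cells receiving at least one infectious particle,
    assuming particles distribute over cells as a Poisson process (an assumption)."""
    return 1.0 - math.exp(-moi)


if __name__ == "__main__":
    cells, moi = 2e6, 0.25                  # values from the text
    hypothetical_titer = 1e9                # pfu/mL, illustrative only
    print(pfu_required(cells, moi))         # pfu per dish
    print(inoculum_volume_ml(cells, moi, hypothetical_titer))
    print(round(fraction_infected(moi), 3))
```

Under the Poisson assumption, an MOI of 0.25 leaves most cells uninfected at the start, in contrast to the MOI of 10 used for the H5ts36 replication assay in Example 1.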

The results depicted in Figure 7 demonstrate that both cell lines B-6 and C-7 could
trans-complement the replication defect of H5sub100. This result demonstrated that the
expression of the Ad polymerase in B-6 cells was able to overcome the preterminal
protein-mediated replication defect of H5sub100. While not limiting the present invention to
any particular mechanism, the ability of Ad polymerase to overcome the preterminal
protein-mediated replication defect of H5sub100 may be due to a direct physical interaction of
the polymerase with the amino-terminus of the H5sub100-derived preterminal protein.

In support of this hypothesis, it has also been demonstrated that the H5sub100
replication defect can be overcome when LP-293 cells were infected with a 100-fold greater
amount of H5sub100. However, complementation of the H5sub100 replication defect is not
sufficient to overcome the growth defect of H5sub100, since temperature shift-up experiments
have demonstrated that the H5sub100 growth defect is not dependent upon viral replication
[Schaack et al. (1995), supra]. Therefore, the overexpression of the Ad polymerase must
have allowed a very low level but detectable production of infectious H5sub100 particles in
the B-6 cell line. The reduced growth of H5sub100 is therefore not due to a replication
defect, but rather to some other critical activity in which the preterminal protein has a role,
such as augmentation of viral transcription by association with the nuclear matrix [Schaack
and Shenk (1989) Curr. Top. Microbiol. Immunol. 144:185 and Hauser and Chamberlain (1996)
J. Endo. 149:373]. This was confirmed by assessing the ability of the C-7 cell line to
overcome the growth defect of H5sub100 utilizing one-step growth assays performed as follows.

Each of the cell lines (LP-293, B-6 and C-7) was seeded onto 60 mm dishes at 2.0 x
10^6 cells/dish. The cell lines were infected at an MOI of 4 with each of the appropriate
viruses (wtAd5, H5ts36, or H5sub100), and incubated at 38.5°C for 40 hours. The total
amount of infectious virions produced in each 60 mm dish was released from the cell lysates
by three cycles of freeze-thawing, and the titer was then determined by limiting dilution and
plaque assay on B-6 cells at 38.5°C. The results are summarized in Figure 8.

The results shown in Figure 8 demonstrated that even though the B-6 cell line allowed
normal replication and plaque formation of H5sub100 at 38.5°C (in fact, B-6 cells were
utilized to determine the plaque titres depicted in Figure 8), they could not allow high level
growth of H5sub100 and only produced titres of H5sub100 equivalent to that produced by the
LP-293 cells. The C-7 cell line produced 100-fold more virus than the LP-293 or B-6 cells
(see Figure 8). Encouragingly, the titre of H5sub100 produced by the C-7 cells approached
titres produced by LP-293 cells infected with wild-type virus, Ad5. When the H5sub100
virions produced from infection of the C-7 cells were used to infect LP-293 cells at 38.5°C,
all virus produced retained the ts mutation (i.e., at least a 1000-fold drop in pfu was detected
when LP-293 cells were respectively infected at 38.5°C vs. 32.0°C). This finding effectively
rules out the theoretical possibility that the H5sub100 input virus genomes recombined with
the preterminal protein sequences present in the C-7 cells.
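The titer determinations and the "at least 1000-fold drop" criterion used above can be expressed as two short calculations. This is a minimal sketch: the plaque counts, dilution, inoculum volume, and titers in the usage example are hypothetical, and the function names are illustrative rather than taken from the text.

```python
import math


def titer_pfu_per_ml(plaques: int, dilution: float, volume_ml: float) -> float:
    """Titer from a plaque count at a given dilution (e.g. 1e-6) and plated volume."""
    return plaques / (dilution * volume_ml)


def fold_drop(titer_permissive: float, titer_nonpermissive: float) -> float:
    """Ratio of the permissive-temperature titer to the non-permissive titer."""
    if titer_nonpermissive == 0:
        return math.inf
    return titer_permissive / titer_nonpermissive


def retains_ts_phenotype(titer_32c: float, titer_38_5c: float, threshold: float = 1000.0) -> bool:
    """True if the stock shows at least `threshold`-fold lower titer at 38.5 C than at 32 C."""
    return fold_drop(titer_32c, titer_38_5c) >= threshold


if __name__ == "__main__":
    # Hypothetical example: 45 plaques from 0.1 mL of a 1e-6 dilution.
    t32 = titer_pfu_per_ml(45, 1e-6, 0.1)
    t385 = 2.0e5  # hypothetical titer of the same stock on LP-293 cells at 38.5 C
    print(f"{t32:.2e} pfu/mL, fold drop {fold_drop(t32, t385):.0f}, ts retained: "
          f"{retains_ts_phenotype(t32, t385)}")
```

The same ratio underlies the earlier statement that the H5sub100 stock had a leakiness of less than 1 per 1000 pfu at the nonpermissive temperature.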

In addition, the C-7 cell line allowed the high level growth of H5ts36, demonstrating
that adequate amounts of the Ad polymerase protein were also being expressed. The C-7 cell
line was capable of trans-complementing the growth of both H5ts36 and H5sub100 after 4
months of serial passaging, demonstrating that the coexpression of the E1, preterminal, and
polymerase proteins was not toxic.

These results demonstrate that the constitutive expression of both the polymerase and
preterminal proteins is not detrimental to normal virus production, which might have occurred
if one or both of the proteins had to be expressed only during a narrow time period during the
Ad life cycle. In summary, these results demonstrated that the C-7 cell line can be used as a
packaging cell line to allow the high level growth of E1, preterminal, and polymerase deleted
Ad vectors.



EXAMPLE 3 

Production Of Adenovirus Vectors
Deleted For E1 And Polymerase Functions



In order to produce Ad vectors deleted for E1 and polymerase functions, a small,
frame-shifting deletion was introduced into the Ad-pol gene contained within an E1-deleted
Ad genome. The plasmid pBHG11 (Microbix) was used as the source of an E1-deleted Ad
genome. pBHG11 contains a deletion of Ad5 sequences from bp 188 to bp 1339 (0.5-3.7
m.u.); this deletion removes the packaging signals as well as E1 sequences. pBHG11 also
contains a large deletion within the E3 region (bp 27865 to bp 30995; 77.5-86.2 m.u.). The
nucleotide sequence of pBHG11 is listed in SEQ ID NO:4 [for cross-correlation between the
pBHG11 sequence and the Ad5 genome (SEQ ID NO:1), it is noted that nucleotide 8,773 in
pBHG11 is equivalent to nucleotide 7,269 in Ad5].

pBHG11 was chosen to provide the Ad backbone because this plasmid contains a large
deletion within the E3 region (77.5 to 86.2 m.u.) and therefore vectors derived from this
plasmid permit the insertion of large pieces of foreign DNA. A large cloning capacity is
important when the pol⁻ vector is to be used to transfer a large gene such as the dystrophin
gene (cDNA = 13.6 kb). However, the majority of genes are not this large and therefore
other Ad backbones containing smaller deletions within the E3 region (e.g., pBHG10, which
contains a deletion between 78.3 and 85.8 m.u.; Microbix) may be employed for the
construction of pol⁻ vectors using the strategy outlined below.

a) Construction Of A Plasmid Containing A Portion Of The
Adenovirus Genome Containing The Polymerase Gene
A fragment of the Ad genome containing the pol gene located on pBHG11 was
subcloned to create pBSA-XB. Due to the large size of the Ad genome, this intermediate
plasmid was constructed to facilitate the introduction of a deletion within the pol gene.
pBSA-XB was constructed as follows. The polylinker region of pBluescript
(Stratagene) was modified to include additional restriction enzyme recognition sites (the
sequence of the modified polylinker is provided in SEQ ID NO:5; the remainder of
pBluescript was not altered); the resulting plasmid was termed pBSX. pBHG11 was digested
with XbaI and BamHI and the 20.223 kb fragment containing the pol and pTP coding regions
(E2b region) was inserted into pBSX digested with XbaI and BamHI to generate pBSA-XB.



b) Construction Of pBHG11Δpol 

A deletion was introduced into the pol coding region contained within pBSA-XB in 
such a manner that other key viral elements were not disturbed (e.g., the major late promoter, 
the tripartite leader sequences, the pTP gene and other leader sequences critical for normal 
virus viability). The deletion of the pol sequences was carried out as follows. pBSA-XB was 
digested with BspEI and the ends were filled in using T4 DNA polymerase. The BspEI-digested, 
T4 polymerase filled DNA was then digested with BamHI and the 8809 bp 
BamHI/BspEI(filled) fragment was isolated as follows. The treated DNA was run on a 0.6% 
agarose gel (TAE buffer) and the 8809 bp fragment was excised from the gel and purified 
using a QIAEX Gel Extraction Kit according to the manufacturer's instructions (QIAGEN, 
Chatsworth, CA). 

A second aliquot of pBSA-XB DNA was digested with BspHI and the ends were filled 
in with T4 DNA polymerase. The BspHI-digested, T4 polymerase filled DNA was then 
digested with BamHI and the 13,679 bp BamHI/BspHI(filled) fragment was isolated as 
described above. 

The purified 8809 bp BamHI/BspEI(filled) fragment and the purified 13,679 bp 
BamHI/BspHI(filled) fragment were ligated to generate pΔpol. pΔpol contains a 612 bp 
deletion within the pol gene (bp 8772 to 9385; numbering relative to that of pBHG11) and 
lacks the 11.4 kb BamHI fragment containing the right arm of the Ad genome found within 
pBHG11. 

To provide the right arm of the Ad genome, pΔpol was digested with BamHI followed 
by treatment with calf intestinal alkaline phosphatase. pBHG11 was digested with BamHI and 
the 11.4 kb fragment was isolated and purified using a QIAEX Gel Extraction Kit as 
described above. The purified 11.4 kb BamHI fragment was ligated to the 
BamHI/phosphatased pΔpol to generate pBHG11Δpol. Proper construction of pBHG11Δpol 
was confirmed by restriction digestion (HindIII). 
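
The fragment sizes quoted in this subsection can be cross-checked with simple arithmetic. The sketch below is illustrative only: it ignores any bases gained or lost at the filled-in junction (the text does not give the junction sequence), and the variable names are hypothetical.

```python
# Fragment sizes (bp) as stated in the text.
BAMHI_BSPEI_FILLED = 8809
BAMHI_BSPHI_FILLED = 13679
STATED_POL_DELETION = 612
XBAI_BAMHI_INSERT = 20223  # the 20.223 kb fragment cloned into pBSX

# Size of the ligation product, assuming a simple blunt-end join.
ligation_product = BAMHI_BSPEI_FILLED + BAMHI_BSPHI_FILLED

# Parent plasmid size implied by adding the stated deletion back.
implied_parent = ligation_product + STATED_POL_DELETION

# Vector backbone size implied by subtracting the cloned insert.
implied_backbone = implied_parent - XBAI_BAMHI_INSERT

print(ligation_product)   # 22488
print(implied_parent)     # 23100
print(implied_backbone)   # 2877
```

The implied backbone of roughly 2.9 kb is in the range expected for a pBluescript-derived vector, although the size of pBSX is not stated in the text.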

c) Rescue And Propagation Of Ad5Δpol Virus 

The Ad genome contained within pBHG11Δpol lacks the packaging signals. 
Therefore, in order to recover virus containing the 612 bp deletion within the pol gene from 
pBHG11Δpol, this plasmid must be cotransfected into packaging cells along with DNA that 
provides a source of the Ad packaging signals. The Ad packaging signals may be provided 
by wild-type or mutant Ad viral DNA or alternatively may be provided using a shuttle vector 
which contains the left-end Ad5 sequences including the packaging signals, such as 
pΔE1sp1A, pΔE1sp1B (Microbix) or pAdBglII (pAdBglII is a standard shuttle vector which 
contains 0-1 m.u. and 9-16 m.u. of the adenovirus genome). 

To rescue virus, pBHG11Δpol was co-transfected with Ad5dl7001 viral DNA. 
Ad5dl7001 contains a deletion in the E3 region; the E3 deletion contained within Ad5dl7001 
is smaller than the deletion contained within the Ad genome contained within pBHG11. It is 
not necessary that Ad5dl7001 be used to recover virus; other adenoviruses, including 
wild-type adenoviruses, may be used to rescue virus from pBHG11Δpol. 

It has been reported that the generation of recombinant Ads is more efficient if Ad 
DNA-terminal protein complex (TPC) is employed in conjunction with a plasmid containing 
the desired deletion [Miyake et al. (1996) Proc. Natl. Acad. Sci. USA 93:1320]. Accordingly, 
Ad5dl7001-TPC was prepared as described [Miyake et al. (1996), supra]. Briefly, purified 
Ad5dl7001 virions (purified through an isopycnic CsCl gradient centered at 1.34 g/ml) were 
lysed by the addition of an equal volume of 8 M guanidine hydrochloride. The released 
Ad5dl7001 DNA-TPC was then purified through a buoyant density gradient of 2.8 M CsCl/4 
M guanidine hydrochloride by centrifugation for 16 hr at 55,000 rpm in a VTi65 rotor 
(Beckman). Gradient fractions containing Ad5dl7001 DNA-TPC were identified using an 
ethidium bromide spot test and then pooled and dialyzed extensively against TE buffer. BSA 
was then added to a final concentration of 0.5 mg/ml and aliquots were stored at -80°C. The 
Ad5dl7001 DNA-TPC was then digested with ScaI and gel-filtered through a Sephadex 
G-50 spin column. 

One hundred nanograms of the digested Ad5dl7001 DNA-TPC was mixed with 5 µg 
of pBHG11Δpol and used to transfect pol-expressing 293 cells (i.e., C-7). Approximately 10 
days post-transfection, plaques were picked. The recombinant viruses were plaque purified 
and propagated using standard techniques (Graham and Prevec, supra). Viral DNA was 
isolated, digested with restriction enzymes and subjected to Southern blotting analysis to 
determine the organization of the recovered viruses. Two forms of virus containing the 612 
bp pol deletion were recovered and termed Ad5ΔpolΔE3I and Ad5ΔpolΔE3II. One form of 
recombinant pol⁻ virus recovered, Ad5ΔpolΔE3I, underwent a double recombination event 
with the Ad5dl7001 sequences and contains the E3 deletion contained within Ad5dl7001 at 
the right end of the genome. The second form of recombinant pol⁻ virus, Ad5ΔpolΔE3II, 
retained pBHG11 sequences at the right end of the genome (i.e., contained the E3 deletion 
found within pBHG11). These results demonstrate the production of a recombinant Ad vector 
containing a deletion within the pol gene. 



d) Characterization Of The E1+, pol- Viruses 

To demonstrate that the deletion contained within these two pol⁻ viruses renders the 
virus incapable of producing functional polymerase, Ad5ΔpolΔE3I was used to infect 293 
cells and pol-expressing 293 cells (B-6 and C-7 cell lines) and a viral 
replication-complementation assay was performed as described in Example 1e. Briefly, 293, 
B-6 and C-7 cells were seeded onto 60 mm dishes at a density of 2 x 10^6 cells/dish and 
infected with H5sub100 or Ad5ΔpolΔE3I at an MOI of 0.25. The infected cells were then 
incubated at 37°C or 38.5°C for 16 hours, or at 32.0°C for 40 hours. Cells from each infected 
plate were then harvested and total DNA was extracted. Four micrograms of each DNA sample 
was digested with HindIII, electrophoresed through an agarose gel, transferred to a nylon 
membrane and probed with 32P-labeled adenoviral DNA. The resulting autoradiograph is shown 
in Figure 9. In Figure 9, each panel shows, from left to right, DNA extracted from 293, B-6 
and C-7 cells, respectively, infected with either H5sub100 or Ad5ΔpolΔE3I (labeled 
Ad5ΔPOL in Fig. 9). 
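
The infection conditions above translate directly into the amount of virus needed per dish. The following sketch shows that arithmetic; the stock titer is a hypothetical value chosen only for illustration and does not come from the text.

```python
def infectious_units_needed(cells_per_dish: float, moi: float) -> float:
    """Infectious units to add per dish for a target multiplicity of infection."""
    return cells_per_dish * moi


def volume_needed_ul(units: float, titer_per_ml: float) -> float:
    """Volume of virus stock (microliters) containing `units` infectious units."""
    return units / titer_per_ml * 1000.0


# 2 x 10^6 cells per 60 mm dish at an MOI of 0.25, as stated in the text.
units = infectious_units_needed(2e6, 0.25)
print(units)  # 500000.0

# Hypothetical stock titer (infectious units per ml); not a value from the text.
ASSUMED_TITER_PER_ML = 1e9
print(volume_needed_ul(units, ASSUMED_TITER_PER_ML))
```

At an MOI of 0.25 most cells receive no virus in the first round, so the signal on the blot depends on replication of the input genomes rather than on input DNA alone.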

As shown in Figure 9, the recombinant pol⁻ virus was found to be viable on 
pol-expressing 293 cells but not on 293 cells. These results demonstrate that recombinant Ad 
viruses containing the 612 bp deletion found within pΔpol lack the ability to express Ad 
polymerase. These results also demonstrate that B-6 and C-7 cells efficiently complement the 
Δpol deletion found within Ad5ΔpolΔE3I and Ad5ΔpolΔE3II. In addition, these results show 
that replication of the ts pTP mutant H5sub100 can be complemented by high level 
expression of the Ad polymerase with or without co-expression of pTP. 

Because the expression of early genes is required for the expression of the late gene 
products, the ability of recombinant viruses containing the Δpol deletion to direct the 
expression of late gene products was examined. 293 and C-7 cells were infected with 
Ad5ΔpolΔE3I and 24 hours after infection cell extracts were prepared. The cell extracts were 
serially diluted and examined for the expression of the fiber protein (a late gene product) by 
immunoblot analysis. The immunoblot was performed as described in Example 3C with the 
exception that the primary antibody used was an anti-fiber antibody (FIBER-KNOB, obtained 
from Robert Gerard, University of Texas Southwestern Medical School). The results of this 
immunoblotting analysis revealed that no fiber protein was detected from 293 cells infected 
with Ad5ΔpolΔE3I (at any dilution of the cell extract). In contrast, even a 1:1000 dilution 
of cell extract prepared from C-7 cells infected with Ad5ΔpolΔE3I produced a visible band 
on the immunoblot. Therefore, the pol deletion contained within Ad5ΔpolΔE3I resulted in a 
greater than 1000-fold decrease in fiber production. 
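
The reasoning behind the fold-decrease estimate can be sketched as an endpoint-dilution comparison. The dilution series below is hypothetical apart from the two facts stated in the text (a band at 1:1000 for the C-7 extract, no band at any dilution for the 293 extract), and the function name is illustrative.

```python
def min_fold_difference(detected_dilutions, undetected_dilutions):
    """Lower bound on the fold difference between two samples.

    Each dilution is given as the denominator of the dilution factor
    (1 = undiluted, 1000 = 1:1000). The bound is the highest dilution still
    detected in one sample divided by the lowest dilution that gave no signal
    in the other sample.
    """
    return max(detected_dilutions) / min(undetected_dilutions)


# C-7 extract: band still visible at 1:1000 (intermediate steps are assumed).
c7_detected = [1, 10, 100, 1000]
# 293 extract: no band at any dilution, including the undiluted extract.
hek293_undetected = [1, 10, 100, 1000]

print(min_fold_difference(c7_detected, hek293_undetected))  # 1000.0
```

Because the undiluted 293 extract gave no signal at all, the true difference is at least this bound, which is consistent with the conclusion drawn in the text.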

The above results demonstrate that polymerase gene sequences can be deleted from the 
virus and that the resulting deleted virus will only grow on cells producing Ad polymerase in 
trans. Using the pol-expressing cell lines described herein (e.g., B-6 and C-7), large 
quantities of the pol⁻ viruses can be prepared. A dramatic shut-down in growth and late gene 
expression is seen when cells which do not express Ad polymerase are infected with the pol⁻ 
viruses. 



e) Generation Of E1⁻, pol⁻ Ad Vectors 

Ad5dl7001, used above to rescue virus containing the polymerase deletion, is an 
E1-containing virus. The presence of E1 sequences on the recombinant pol⁻ viruses is 
undesirable when the recombinant virus is to be used to transfer genes into the tissues of 
animals; the E1 region encodes the transforming genes and such viruses replicate extremely 
well in vivo, leading to an immune response directed against cells infected with the 
E1-containing virus. 

E1⁻ viruses containing the above-described polymerase deletion are generated as 
follows. pBHG11Δpol is cotransfected into pol-expressing 293 cells (e.g., B-6 or C-7) along 
with a shuttle vector containing the left-end Ad5 sequences including the packaging signals. 
Suitable shuttle vectors include pΔE1sp1A (Microbix), pΔE1sp1B (Microbix) or pAdBglII. 
The gene of interest is inserted into the polylinker region of the shuttle vector and this 
plasmid is then cotransfected into B-6 or C-7 cells along with pBHG11Δpol to generate a 
recombinant E1⁻, pol⁻ Ad vector containing the gene of interest. 

EXAMPLE 4 

Production Of Adenovirus Vectors 
Deleted For E1 And Preterminal Protein Functions 

In order to produce Ad vectors deleted for E1 and preterminal protein functions, a 
small deletion was introduced into the Ad preterminal protein (pTP) gene contained within an 
E1-deleted Ad genome. The plasmid pBHG11 (Microbix) was used as the source of an 
E1-deleted Ad genome to maximize the cloning capacity of the resulting pTP⁻ vector. However, 
other Ad backbones containing smaller deletions within the E3 region (e.g., pBHG10, which 
contains a deletion between 78.3 and 85.8 m.u.; Microbix) may be employed for the 
construction of pTP⁻ vectors using the strategy outlined below. 

a) Construction Of pΔpTP 

A deletion was introduced into the pTP coding region contained within pBSA-XB (Ex. 
3) in such a manner that other key viral elements were not disturbed (e.g., the tripartite leader 
sequences, the i-leader sequences, the VA-RNA I and II genes, the 55 kD gene and the pol 
gene). The deletion of the pTP sequences was carried out as follows. pBSA-XB was 
digested with XbaI and EcoRV and the 7.875 kb fragment was isolated as described (Ex. 3). 
Another aliquot of pBSA-XB was digested with MunI and the ends were filled in using T4 
DNA polymerase. The MunI-digested, T4 polymerase filled DNA was then digested with 
XbaI and the 14.894 kb XbaI/MunI(filled) fragment was isolated as described (Ex. 3). The 
7.875 kb XbaI/EcoRV fragment and the 14.894 kb XbaI/MunI(filled) fragment were ligated 
together to generate pΔpTP. 

WO 98/17783                                                    PCT/US97/19541 


b) Construction Of pBHG11ΔpTP 

pΔpTP contains a 429 bp deletion within the pTP gene (bp 10,705 to 11,134; 
numbering relative to that of pBHG11) and lacks the 11.4 kb BamHI fragment containing the 
right arm of the Ad genome found within pBHG11. 

To provide the right arm of the Ad genome, pΔpTP was digested with BamHI 
followed by treatment with shrimp alkaline phosphatase (SAP; U.S. Biochemicals, Cleveland, 
OH). pBHG11 was digested with BamHI and the 11.4 kb fragment was isolated and purified 
using a QIAEX Gel Extraction Kit as described (Ex. 3). The purified 11.4 kb BamHI 
fragment was ligated to the BamHI/phosphatased pΔpTP to generate pBHG11ΔpTP. Proper 
construction of pBHG11ΔpTP was confirmed by restriction digestion. 

c) Rescue And Propagation Of Ad Vectors Containing The pTP 
Deletion 

The Ad genome contained within pBHG11ΔpTP lacks the Ad packaging signals. 
Therefore, in order to recover virus containing the 429 bp deletion within the pTP gene from 
pBHG11ΔpTP, this plasmid must be cotransfected into packaging cells along with DNA that 
provides a source of the Ad packaging signals. The Ad packaging signals may be provided 
by wild-type or mutant Ad viral DNA or alternatively may be provided using a shuttle vector 
which contains the left-end Ad5 sequences including the packaging signals, such as 
pΔE1sp1A, pΔE1sp1B (Microbix) or pAdBglII. 

Recombinant Ad vectors containing the pTP deletion which contain a deletion within 
the E3 region are generated by cotransfection of pBHG11ΔpTP (the gene of interest is 
inserted into the unique PacI site of pBHG11ΔpTP) with an E3-deleted Ad virus such as 
Ad5dl7001 into pTP-expressing 293 cells (e.g., C-7); viral DNA-TPC are utilized as described 
above in Example 3. 

Recombinant vectors containing the pTP deletion which also contain deletions within 
the E1 and E3 regions are generated by cotransfection of pBHG11ΔpTP into pTP-expressing 
293 cells (e.g., C-7) along with a shuttle vector containing the left-end Ad5 sequences 
including the packaging signals. Suitable shuttle vectors include pΔE1sp1A (Microbix), 
pΔE1sp1B (Microbix) or pAdBglII. The gene of interest is inserted into the polylinker region 
of the shuttle vector and this plasmid is then cotransfected into B-6 or C-7 cells along with 
pBHG11ΔpTP to generate a recombinant E1⁻, pTP⁻ Ad vector containing the gene of 
interest. 

EXAMPLE 5 

Production Of Adenovirus Vectors Deleted For 
E1, Polymerase And Preterminal Protein Functions 



In order to produce Ad vectors deleted for E1, polymerase and preterminal protein 
functions, a deletion encompassing pol and pTP gene sequences was introduced into the Ad 
sequences contained within an E1-deleted Ad genome. The plasmid pBHG11ΔE4 was used 
as the source of an E1-deleted Ad genome to maximize the cloning capacity of the resulting 
pol⁻, pTP⁻ vector. pBHG11ΔE4 is a modified form of pBHG11 which contains a deletion of 
all E4 genes except for E4 ORF 6; the E4 region was deleted to create more room for the 
insertion of a gene of interest and to further disable the virus. However, other E1⁻ Ad 
backbones, such as pBHG11 and pBHG10 (Microbix; pBHG10 contains a smaller deletion 
within the E3 region as compared to pBHG11), may be employed for the construction of 
pol⁻, pTP⁻ vectors using the strategy outlined below. 

Due to the complexity of the cloning steps required to introduce a 2.3 kb deletion that 
removes portions of both the pTP and pol genes, this deletion was generated using several 
steps as detailed below. 



a) Construction Of pAXBΔpolΔpTPVARNA+t13 

In order to create a plasmid containing Ad sequences that have a deletion within the 
pol and pTP genes, pAXBΔpolΔpTPVARNA+t13 was constructed as follows. 

pBSA-XB was digested with BspEI and the 18 kb fragment was isolated and 
recircularized to create pAXBΔpolΔpTP; this plasmid contains a deletion of the sequences 
contained between the BspEI sites located at 8,773 and 12,513 (numbering relative to 
pBHG11). 

A fragment encoding the VA-RNA sequences and the third leader of the tripartite 
leader sequence was prepared using the PCR as follows. The PCR was carried out in a 
solution containing H5ts36 virion DNA (any Ad DNA, including wild-type Ad, may be used), 
2 ng/mL of primers 4005E and 4006E, 10 mM Tris-HCl, pH 8.3, 50 mM KCl, 1.5 mM 
MgCl2, 0.001% gelatin and Pfu polymerase. The forward primer, 4005E, 
[5'-TGCCGCAGCACCGGATGCATC-3' (SEQ ID NO:6)] contains sequences complementary to 
residues 12,551 to 12,571 of pBHG11 (SEQ ID NO:4). The reverse primer, 4006E, 
[5'-GCGTCCGGAGGCTGCCATGCGGCAGGG-3' (SEQ ID NO:7)] is complementary to 
residues 11,091 to 11,108 of pBHG11 (SEQ ID NO:4) and also contains a BspEI site (TCCGGA). 
The predicted sequence of the ~1.6 kb PCR product is listed in SEQ ID NO:8. 
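
The primer descriptions can be checked against the printed sequences with a few lines of code. This is a sketch only: the sequences are those printed above with whitespace removed, TCCGGA is the BspEI recognition sequence, and the helper name is hypothetical.

```python
FORWARD_4005E = "TGCCGCAGCACCGGATGCATC"        # SEQ ID NO:6
REVERSE_4006E = "GCGTCCGGAGGCTGCCATGCGGCAGGG"  # SEQ ID NO:7
BSPEI_SITE = "TCCGGA"                          # BspEI recognition sequence


def span_length(first: int, last: int) -> int:
    """Number of residues in an inclusive coordinate range."""
    return last - first + 1


# Locate the added BspEI site and the template-matching part that follows it.
site_start = REVERSE_4006E.find(BSPEI_SITE)
annealing_part = REVERSE_4006E[site_start + len(BSPEI_SITE):]

print(len(FORWARD_4005E), span_length(12551, 12571))   # 21 21
print(site_start)                                       # 3 (0-based index)
print(len(annealing_part), span_length(11091, 11108))   # 18 18
```

The 21-residue forward primer and the 18-residue annealing portion of the reverse primer match the lengths of the coordinate ranges given in the text; the three bases ahead of the BspEI site are a 5' extension that is not described in the text.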



PCR was performed with a Perkin Elmer 9600 Thermocycler utilizing the following 
cycling parameters: initial denaturation at 94°C for 3 min, 3 cycles of denaturation at 94°C 
for 30 sec, annealing at 50°C for 30 sec, and extension at 72°C for 60 sec, followed by 
another 27 cycles with an increased annealing temperature of 56°C, with a final extension at 
72°C for 10 minutes. The ~1.6 kb PCR product was purified using a QIAEX Gel Extraction 
Kit as described (Ex. 3). 
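
The cycling parameters above can be written out as a simple program description, which makes the total cycle count and programmed hold time easy to verify. The data structure is illustrative and does not correspond to any thermocycler's file format; ramp times between steps are not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float
    seconds: int


def cycle(anneal_c: float) -> list:
    """One cycle: denaturation, annealing, extension."""
    return [Step(94, 30), Step(anneal_c, 30), Step(72, 60)]


program = (
    [Step(94, 180)]      # initial denaturation, 3 min
    + cycle(50) * 3      # 3 cycles with annealing at 50 C
    + cycle(56) * 27     # 27 cycles with annealing at 56 C
    + [Step(72, 600)]    # final extension, 10 min
)

total_cycles = 3 + 27
hold_seconds = sum(step.seconds for step in program)

print(total_cycles)        # 30
print(hold_seconds / 60)   # 73.0 (minutes of programmed hold time)
```

The lower annealing temperature in the first three cycles is consistent with a reverse primer that carries a non-templated 5' extension, since only part of that primer matches the template until the first products have been copied.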

The purified PCR fragment was then digested with BspEI. pAXBΔpolΔpTP was 
digested with BspEI followed by treatment with SAP. The BspEI-digested PCR fragment and 
the BspEI/SAP-treated pAXBΔpolΔpTP were ligated together to create 
pAXBΔpolΔpTPVARNA+t13. 



b) Construction Of pBHG11ΔpolΔpTPVARNA+t13 

pAXBΔpolΔpTPVARNA+t13 contains a 2.3 kb deletion within the pol and pTP genes 
(bp 8,773 to 11,091; numbering relative to that of pBHG11) and lacks the 11.4 kb BamHI 
fragment containing the right arm of the Ad genome. 
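
The deletion sizes quoted in Examples 3 to 5 can be compared with the endpoint coordinates given for each construct. The sketch below simply takes the difference between the two stated coordinates; the dictionary keys are illustrative labels.

```python
# Deletion endpoints as stated in the text (plasmid numbering).
deletions = {
    "pol (Example 3)": (8772, 9385),        # stated size: 612 bp
    "pTP (Example 4)": (10705, 11134),      # stated size: 429 bp
    "pol + pTP (Example 5)": (8773, 11091), # stated size: 2.3 kb
}


def span(start: int, end: int) -> int:
    """Distance between two coordinates (end minus start)."""
    return end - start


sizes = {name: span(*coords) for name, coords in deletions.items()}
for name, size in sizes.items():
    print(name, size)
```

The pTP and double-deletion values agree with the stated sizes (429 bp and about 2.3 kb). The pol coordinates give 613 rather than the stated 612 bp, which may simply reflect how the endpoints are counted.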

To provide the right arm of the Ad genome, pAXBΔpolΔpTPVARNA+t13 was 
digested with BamHI followed by treatment with SAP. pBHG11ΔE4 (pBHG11 or pBHG10 
may be used in place of pBHG11ΔE4) was digested with BamHI and the 11.4 kb fragment 
was isolated and purified using a QIAEX Gel Extraction Kit as described (Ex. 3). The 
purified 11.4 kb BamHI fragment was ligated to the BamHI/phosphatased 
pAXBΔpolΔpTPVARNA+t13 to generate pBHG11ΔpolΔpTPVARNA+t13. Proper 
construction of pBHG11ΔpolΔpTPVARNA+t13 was confirmed by restriction digestion. 

c) Rescue And Propagation Of Ad Vectors Containing The pol, 
pTP Double Deletion 

The Ad genome contained within pBHG11ΔpolΔpTPVARNA+t13 lacks the Ad 
packaging signals. Therefore, in order to recover virus containing the 2.3 kb deletion within 
the pol and pTP genes from pBHG11ΔpolΔpTPVARNA+t13, this plasmid must be 
cotransfected into packaging cells along with DNA that provides a source of the Ad 
packaging signals. The Ad packaging signals may be provided by wild-type or mutant Ad 
viral DNA or alternatively may be provided using a shuttle vector which contains the left-end 
Ad5 sequences including the packaging signals, such as pΔE1sp1A, pΔE1sp1B (Microbix) or 
pAdBglII. 



Recombinant Ad vectors containing the pol, pTP double deletion which also contain a 
deletion within the E3 region are generated by cotransfection of 
pBHG11ΔpolΔpTPVARNA+t13 (the gene of interest is inserted into the unique PacI site of 
pBHG11ΔpolΔpTPVARNA+t13) with an E3-deleted Ad virus such as Ad5dl7001 into pol- and 
pTP-expressing 293 cells (e.g., C-7); viral DNA-TPC are utilized as described (Ex. 3). 

Recombinant vectors containing the pol, pTP double deletion which also contain 
deletions within the E1 and E3 regions are generated by cotransfection of 
pBHG11ΔpolΔpTPVARNA+t13 into pol- and pTP-expressing 293 cells (e.g., C-7) along with 
a shuttle vector containing the left-end Ad5 sequences including the packaging signals. 
Suitable shuttle vectors include pΔE1sp1A (Microbix), pΔE1sp1B (Microbix) or pAdBglII. 
The gene of interest is inserted into the polylinker region of the shuttle vector and this 
plasmid is then cotransfected into B-6 or C-7 cells along with 
pBHG11ΔpolΔpTPVARNA+t13 to generate a recombinant E1⁻, pol⁻, pTP⁻ Ad vector 
containing the gene of interest. 

EXAMPLE 6 

Encapsidated Adenovirus Minichromosomes 
Containing A Full Length Dystrophin cDNA 

In this Example, the construction of an encapsidated adenovirus minichromosome 
(EAM) consisting of an infectious encapsidated linear genome containing Ad origins of 
replication, packaging signal elements, a β-galactosidase reporter gene cassette and a full 
length (14 kb) dystrophin cDNA regulated by a muscle specific enhancer/promoter is 
described. EAMs are generated by cotransfecting 293 cells with supercoiled plasmid DNA 
(pAd5βdys) containing an embedded inverted origin of replication (and the remaining above 
elements) together with linear DNA from E1-deleted virions expressing human placental 
alkaline phosphatase (hpAP). All proteins necessary for the generation of EAMs are provided 
in trans from the hpAP virions and the two can be separated from each other on equilibrium 
CsCl gradients. These EAMs are useful for gene transfer to a variety of cell types both in 
vitro and in vivo. 

a) Generation And Propagation Of Encapsidated Adenovirus 
Minichromosomes 

To establish a vector system capable of delivering full length dystrophin cDNA clones, 
the minimal region of Ad5 needed for replication and packaging was combined with a 
conventional plasmid carrying both dystrophin and a β-galactosidase reporter gene. In the 
first vector constructed, these elements were arranged such that the viral ITRs flanked (ITRs 
facing outward) the reporter gene and the dystrophin gene [i.e., the vector contained from 5' 
to 3' the right or 3' ITR (mu 100 to 99), the dystrophin gene and the reporter gene and the 
left or 5' ITR and packaging sequence (mu 1 to 0)]. Upon introduction of this vector along 
with helper virus into 293 cells, no encapsidated adenovirus minichromosomes were 
recovered. 

The second and successful vector, pAd5βdys (Fig. 10), contains 2.1 kb of adenovirus 
DNA, together with a 14 kb murine dystrophin cDNA under the control of the mouse muscle 
creatine kinase enhancer/promoter, as well as a β-galactosidase gene regulated by the human 
cytomegalovirus enhancer/promoter. 

Figure 10 shows the structure of pAd5βdys (27.8 kb). The two inverted adenovirus 
origins of replication are represented by a left and right inverted terminal repeat 
(LITR/RITR). Replication from these ITRs generates a linear genome whose termini 
correspond to the 0 and 100 map unit (mu) locations. Orientation of the origin with respect 
to wild type adenovirus serotype 5 sequences in mu is indicated above the figure (1 mu = 360 
bp). Encapsidation of the mature linear genome is enabled by five (AI-AV) packaging signals 
(Ψ). The E. coli β-galactosidase and Mus musculus dystrophin cDNAs are regulated by 
cytomegalovirus (CMV) and muscle creatine kinase (MCK) enhancer/promoter elements, 
respectively. Both expression cassettes contain the SV40 polyadenylation (pA) signal. Since 
the E1A enhancer/promoter overlaps with the packaging signals, pAd5βdys was engineered 
such that RNA polymerase transcribing from the E1A enhancer/promoter will encounter the 
SV40 late polyadenylation signal. Pertinent restriction sites used in constructing pAd5βdys 
are indicated below the figure. An adenovirus fragment corresponding to mu 6.97 to 7.77 
was introduced into pAd5βdys during the cloning procedure (described below). P1 and P2 
represent the locations of probes used for Southern blot analysis. Restriction sites destroyed 
during the cloning of the Ad5 origin of replication and packaging signal are indicated in 
parentheses. 

pAd5βdys was constructed as follows. pBSX (Ex. 3a) was used as the backbone for 
construction of pAd5βdys. The inverted fused Ad5 origin of replication and five 
encapsidation signals were excised as a PstI/XbaI fragment from pAd5ori, a plasmid 
containing the 6 kb HindIII fragment from pFG140 (Microbix Biosystems). This strategy also 
introduced a 290 bp fragment from Ad5 corresponding to map units 6.97 to 7.77 adjacent to 
the right inverted repeat (see Fig. 10). The E. coli β-galactosidase gene regulated by the 
human CMV immediate early (IE) promoter/enhancer expression cassette was derived as an 
EcoRI/HindIII fragment from pCMVβ [MacGregor and Caskey (1989) Nucleic Acids Res. 
17:2365; the CMV IE enhancer/promoter is available from a number of suppliers, as is the E. 
coli β-galactosidase gene]. The murine dystrophin expression cassette was derived as a 
BssHII fragment from pCVAA, and contains a 3.3 kb MCK promoter/enhancer element 
[Phelps et al. (1995) Hum. Mol. Genet. 4:1251 and Jaynes et al. (1986) Mol. Cell. Biol. 
6:2855]. The sequence of the ~3.3 kb MCK promoter/enhancer element is provided in SEQ ID 
NO:9 (the last nucleotide of SEQ ID NO:9 corresponds to nucleotide +7 of 
the MCK gene). The enhancer element is contained within a 206 bp fragment located between 
-1256 and -1050 relative to the transcription start site of the MCK gene. 
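
The map-unit and promoter coordinates quoted for pAd5βdys can be converted to base pairs with the scale given in the Figure 10 description (1 mu = 360 bp). The helper names below are hypothetical and the sketch is only a consistency check.

```python
BP_PER_MAP_UNIT = 360  # "1 mu = 360 bp", from the Figure 10 description


def mu_to_bp(mu: float) -> float:
    """Convert a length in map units to base pairs."""
    return mu * BP_PER_MAP_UNIT


def interval_bp(mu_start: float, mu_end: float) -> int:
    """Length in bp of a map-unit interval, rounded to the nearest bp."""
    return round(mu_to_bp(mu_end - mu_start))


ad5_fragment_bp = interval_bp(6.97, 7.77)
mck_enhancer_bp = -1050 - (-1256)

print(ad5_fragment_bp)   # 288, close to the 290 bp quoted in the text
print(mck_enhancer_bp)   # 206, the quoted enhancer fragment length
```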

It was hoped that the pAd5βdys plasmid would be packageable into an encapsidated 
minichromosome when grown in parallel with an E1-deleted virus due to the inclusion of both 
inverted terminal repeats (ITR) and the major Ad packaging signals in the plasmid. The ITRs 
and packaging signals were derived from pFG140 (Microbix), a plasmid that generates 
E1-defective Ad particles upon transfection of human 293 cells. 

hpAP is an E1-deleted Ad5 containing the human placental alkaline phosphatase gene 
[Muller et al. (1994) Circ. Res. 75:1039]. This virus was chosen to provide the helper 
functions so that it would be possible to monitor the titer of the helper virus throughout serial 
passages by quantitative alkaline phosphatase assays. 

293 cells were cotransfected with pAd5βdys and hpAP DNA as follows. Low passage 
293 cells (Microbix Biosystems) were grown and passaged as suggested by the supplier. 
pAd5βdys and hpAP DNA (5 and 0.5 µg, respectively) were dissolved in 70 µl of 20 mM 
HEPES buffer (pH 7.4) and incubated with 30 µl of DOTAP (BMB) for 15 min at room 
temperature. This mixture was resuspended in 2 ml of DMEM supplemented with 2% fetal 
calf serum (FCS) and added dropwise to a 60 mm plate of 293 cells at 80% confluency. Four 
hours post-transfection the media was replaced by DMEM with 10% FCS. Cytopathic effect 
was observed 6-12 days post-transfection. 

Cotransfection of 293 cells with supercoiled pAd5βdys and linear hpAP DNA 
produced Ad5βdys EAMs (the encapsidated version of pAd5βdys) approximately 6 to 12 days 
post-transfection, as evidenced by the appearance of a cytopathic effect (CPE). Initially, the 
amount of Ad5βdys EAMs produced was significantly lower than that of hpAP virions, so the 
viral suspension was used to re-infect fresh cultures, from which virus was isolated and used 
for serial infection of additional cultures (Fig. 11). Infection and serial passaging of 293 cells 
was carried out as follows. 
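
The 10:1 mass ratio of plasmid to helper DNA used in the cotransfection can be restated as an approximate molar ratio. In this sketch the plasmid size is the 27.8 kb given for Figure 10, and the helper genome length is an assumption derived from the roughly 8 kb size difference mentioned later in this Example; neither the helper length nor the function name comes directly from the text.

```python
def relative_moles(mass: float, length_kb: float) -> float:
    """Relative molar amount of double-stranded DNA (mass divided by length)."""
    return mass / length_kb


PLASMID_KB = 27.8        # pAd5βdys size from the Figure 10 description
HELPER_KB = 27.8 + 8.0   # assumption: helper genome about 8 kb longer

# Masses in the same unit for both DNAs, so the ratios are unit-independent.
PLASMID_MASS = 5.0
HELPER_MASS = 0.5

mass_ratio = PLASMID_MASS / HELPER_MASS
molar_ratio = (relative_moles(PLASMID_MASS, PLASMID_KB)
               / relative_moles(HELPER_MASS, HELPER_KB))

print(mass_ratio)             # 10.0
print(round(molar_ratio, 1))  # 12.9
```

Under these assumptions the plasmid is in roughly 13-fold molar excess over helper DNA at the time of transfection.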

Lysate from one 60 mm plate of transfected 293 cells was prepared by vigorously 
washing the cells from the plate and centrifuging at 1,000 rpm in a clinical centrifuge. Cells 
were resuspended in DMEM and 2% FCS and freeze-thawed in a dry ice-ethanol bath, cell 
debris was removed by centrifugation, and approximately 75% of the crude lysate was used to 
infect 293 cells in DMEM supplemented with 2% FCS for 1 hour and then supplemented with 
10% FCS thereafter. Infection was allowed to proceed for 18-20 hrs before harvesting the 
virus. The total number of cells infected in each serial passage is indicated in Figure 11. 

In Figure 11, the total number of transducing adenovirus particles produced (output) 
per serial passage on 293 cells, total input virus of either the helper (hpAP) or Ad5βdys, and 
the total number of cells used in each infection are presented. The total number of input/output 
transducing particles was determined by infection of 293 cells plated in 6-well microtiter 
plates. Twenty four hours post-infection the cells were assayed for alkaline phosphatase or 
β-galactosidase activity (described below) to determine the number of cells transduced with 
either Ad5βdys or hpAP. The number of transducing particles was estimated by 
extrapolation of the mean calculated from 31 randomly chosen 2.5 mm² sectors of a 961 mm² 
plate. The intra-sector differences in total output of each type of virus are presented as the 
standard deviations, σ, in Figure 11. For each serial passage, 75% of the total output virus 
from the previous passage was used for infection. 
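
The extrapolation from counted sectors to a whole plate can be sketched as follows. The sector and plate areas are those given in the text; the per-sector counts are invented placeholders, not data from Figure 11, and the function name is hypothetical.

```python
from statistics import mean, stdev

PLATE_AREA_MM2 = 961.0
SECTOR_AREA_MM2 = 2.5


def estimate_transduced_cells(sector_counts):
    """Extrapolate per-sector counts of stained cells to the whole plate.

    Returns (estimate, standard deviation), both scaled to the plate area.
    """
    scale = PLATE_AREA_MM2 / SECTOR_AREA_MM2
    return mean(sector_counts) * scale, stdev(sector_counts) * scale


# Hypothetical counts for 31 sectors; these are not data from the text.
counts = [4, 6, 5, 3, 7, 5, 4, 6, 5, 5, 4, 6, 5, 3, 7,
          5, 4, 6, 5, 5, 4, 6, 5, 3, 7, 5, 4, 6, 5, 5, 5]

estimate, sd = estimate_transduced_cells(counts)
print(len(counts), round(estimate), round(sd))
```

Converting the plate estimate into a total for a lysate would additionally require the fraction of the lysate applied to the well, which is not modelled here.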

Alkaline phosphatase or β-galactosidase activity was determined as follows. For 
detection of alkaline phosphatase, infected 293 cells on Petri dishes were rinsed twice with 
phosphate buffered saline (PBS) and fixed for 10 minutes in 0.5% glutaraldehyde in PBS. 
Cells were again rinsed twice with PBS for ten minutes followed by inactivation of 
endogenous alkaline phosphatase activity at 65°C for 1 hr in PBS prior to the addition of the 
chromogenic substrates BCIP (5-bromo-4-chloro-3-indolyl phosphate) at 0.15 mg/ml and nitro 
blue tetrazolium at 0.3 mg/ml. Cells were incubated at 37°C in darkness for 3-24 hrs. For 
β-galactosidase assays, the cells were fixed and washed as above, then assayed using standard 
methods [MacGregor et al. (1991) In Murray (ed.), Methods in Molecular Biology, Vol. 7: 
Gene Transfer and Expression Protocols, Humana Press Inc., Clifton, NJ, pp. 217-225]. 

As shown in Figure 11, the rate of increase in titer of Ad5βdys EAMs between 
transfection and serial passage 6 was approximately 100 times greater than that for hpAP 
virions. This result indicated that Ad5βdys has a replication advantage over the helper virus, 
probably due to the shorter genome length (a difference of approximately 8 kb) and hence an 
increased rate of packaging. 
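
One way to read the approximately 100-fold figure is as a cumulative relative gain spread over the passages. The sketch below assumes that interpretation and assumes six passages between transfection and serial passage 6; both are assumptions rather than statements from the text.

```python
def per_passage_advantage(total_fold: float, passages: int) -> float:
    """Geometric-mean fold advantage per passage for a cumulative advantage."""
    return total_fold ** (1.0 / passages)


# ~100-fold cumulative advantage over 6 serial passages (interpretation assumed).
advantage = per_passage_advantage(100.0, 6)
print(round(advantage, 2))  # 2.15
```

Under these assumptions, a little more than a two-fold relative gain in each passage would be enough to account for the overall difference.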

Interestingly, after serial passage 6 there was a rapid decrease in the total titer of hpAP 
virions whereas the titer of Ad5βdys EAMs continued to rise. While not limiting the present 
invention to any particular mechanism, at least two possible mechanisms could be responsible 
for this observation. Firstly, a buildup of defective hpAP virions due to infections at high 
multiplicities may slowly out-compete their full length counterparts, a phenomenon that has 
been previously observed upon serial propagation of adenovirus [Daniell (1976) J. Virol. 
19:685; Rosenwirth et al. (1974) Virology 60:431; and Burlingham et al. (1974) Virology 
60:419]. Secondly, the emergence of replication competent virions due to recombination 
events between E1 sequences in cellular DNA and the hpAP genome could lead to a buildup 
of virus particles defective in expressing alkaline phosphatase [Lochmuller et al. (1994) Hum. 
Gene Ther. 5:1485]. 

Southern analysis of DNA prepared from serial lysates 3, 6, 9 and 12 indicated that 
full length dystrophin sequences were present in each of these lysates (Fig. 12A). In addition, 
the correct size restriction fragments were detected using both dystrophin and β-galactosidase 
probes against lysate DNA digested with several enzymes (Figs. 12A-B). 

Figure 12 shows a Southern blot analysis of viral DNA from lysates 3, 6, 9 and 12, 
digested with the restriction enzymes BssHII, NruI and EcoRV, indicating the presence of a 
full length dystrophin cDNA in all lysates. Fragments from the C terminus of Mus musculus 
dystrophin cDNA (A) or the N terminus of E. coli β-galactosidase (B) were labeled with 
32P-dCTP and used as probes [Sambrook et al., supra]. The position of these probes and the 
predicted fragments for each digest are indicated in Figure 10. Note that one end of each 
fragment detected (except the 17.8 kb BssHII dystrophin fragment) is derived from the end of 
the linearized Ad5βdys genome (see Fig. 10). Low levels of shorter products, presumably 
derived from defective virions, become detectable only at high serial passage number. 

At the later passages (9 and 12) there appeared to be an emergence of truncated
Ad5βdys sequences, suggesting that deletions and/or rearrangements may be occurring at later
passages. Hence, most experiments were performed with Ad5βdys EAMs derived from the
earlier passages. The possibility of the emergence of replication competent (E1-containing)
viruses was also examined by infection of HeLa cells with purified and crude serial lysates,
none of which produced any detectable CPE.

b) Purification Of Encapsidated Adenovirus Minichromosomes
On CsCl Gradients

The ability to separate Ad5βdys EAMs from hpAP virions based on their buoyancy
difference (due presumably to their different genome lengths) on CsCl gradients was
examined. Repeated fractionation of the viral lysate allows small differences in buoyancy to
be resolved. CsCl purification of encapsidated adenovirus minichromosomes was performed
as follows. Approximately 25% of the lysate prepared from various passages during serial
infections was used to purify virions. Freeze-thawed lysate was centrifuged to remove the
cell debris. The cleared lysate was extracted twice with 1,1,2-trichlorotrifluoroethane
(Sigma) and applied to CsCl step and self forming gradients.

Purification of virus was initially achieved by passing it twice through CsCl step
gradients with densities of ρ=1.45 and ρ=1.20 in a SW28 rotor (Beckman). After isolation of
the major band in the lower gradient, the virus was passed through a self forming gradient
(initial ρ=1.334) at 37,000 rpm for 24 hrs followed by a relaxation of the gradient by
reducing the speed to 10,000 rpm for 10 hrs in a SW41 rotor (Beckman) at 12°C [Anet and
Strayer (1969) Biochem. Biophys. Res. Commun. 34:328]. The upper band from the gradient
(composed mainly of Ad5βdys virions) was isolated using an 18 gauge needle, reloaded on a
fourth CsCl gradient (ρ=1.334) and purified at 37,000 rpm for 24 hrs followed by 10,000 rpm
for 10 hrs at 12°C.

The Ad5βdys-containing CsCl band was removed in 100 µl fractions from the top of
the centrifugation tube and CsCl was removed by chromatography on Sephadex G-50
(Pharmacia). Aliquots from each fraction were used to infect 293 cells followed by
β-galactosidase and alkaline phosphatase assays to quantitate the level of contamination by
hpAP virions in the final viral isolate.

Results of the physical separation between Ad5βdys EAMs and hpAP virions are
shown in Figure 13. Figure 13 shows the physical separation of Ad5βdys from hpAP virions
at the third (A) and final (B) stages of CsCl purification (initial ρ=1.334) in a SW41 tube.
Aliquots of Ad5βdys EAMs from the final stage were drawn through the top of the
centrifugation tube and assayed for β-galactosidase and alkaline phosphatase expression.

Results of these assays are presented in Figure 14.

Figure 14 shows the level of contamination of Ad5βdys EAMs by hpAP virions
obtained from passage 6. Following four cycles of CsCl purification, aliquots were removed
from the top of the centrifugation tube and used for infection of 293 cells, which were fixed
24 hrs later and assayed for β-galactosidase or alkaline phosphatase activity (described below).
The numbers of transducing particles are underestimates of the actual totals, as in some
cases a positive cell may have been infected by more than one transducing particle. The ratio
of the two types of virions, Ad5βdys EAMs (LacZ) and hpAP (AP), in each fraction is
indicated in the lower graph.
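
The undercount noted above can be approximately corrected if one assumes that transducing
particles are distributed over cells according to a Poisson distribution. That assumption is
common for such assays but is not stated in this example, and the function name and cell
counts below are hypothetical. A minimal sketch:

```python
import math


def corrected_particles(positive_cells: int, total_cells: int) -> float:
    """Estimate transducing particles from stained-cell counts.

    Assumption (not from the source): infection is Poisson distributed, so the
    mean number of particles per cell is m = -ln(1 - f), where f is the
    fraction of cells that stain positive.
    """
    if not 0 <= positive_cells < total_cells:
        raise ValueError("need 0 <= positive_cells < total_cells")
    fraction_positive = positive_cells / total_cells
    mean_per_cell = -math.log(1.0 - fraction_positive)
    return mean_per_cell * total_cells


# Hypothetical counts: 300 stained cells among 1000 exposed cells.
estimate = corrected_particles(300, 1000)
print(f"counted 300 positive cells, estimated {estimate:.0f} transducing particles")
```

The correction is small when few cells are positive and grows as the fraction of positive
cells approaches one, which is why counts from dilute fractions are the least biased.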

The maximum ratio of transducing Ad5βdys EAMs to hpAP virions reproducibly
achieved in this study was 24.8, a contamination with helper virus corresponding to
approximately 4% of the final viral isolate.
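
The relationship between the EAM:helper ratio and the percent contamination can be
checked with a short calculation. The sketch below assumes that both reporter assays score
transducing particles with equal efficiency; the function name is illustrative only.

```python
def helper_fraction(eam_to_helper_ratio: float) -> float:
    """Fraction of all transducing particles that are helper virus.

    For a ratio R of EAMs to helper virions, helper makes up 1 / (1 + R)
    of the total isolate.
    """
    if eam_to_helper_ratio < 0:
        raise ValueError("ratio must be non-negative")
    return 1.0 / (1.0 + eam_to_helper_ratio)


# Ratio reported in the text above.
print(f"helper virus fraction: {helper_fraction(24.8):.1%}")
```

A ratio of 24.8 gives a helper fraction just under 4%, in line with the figure quoted above.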



c) Dystrophin And β-galactosidase Expression By Encapsidated
Adenovirus Minichromosomes In Muscle Cells

To determine if the Ad5βdys EAMs were able to express β-galactosidase and
dystrophin in muscle cells, mouse mdx myogenic cultures were infected with CsCl purified
EAMs.

i) Propagation And Infection Of Muscle Cells
MM14 and mdx myogenic cell lines were kindly provided by S. Hauschka (University
of Washington) and were cultured as previously described [Linkhart et al. (1981) Dev. Biol.
86:19 and Clegg et al. (1987) J. Cell Biol. 105:949]. Briefly, myoblasts were grown on
plastic tissue culture plates coated with 0.1% gelatin in Ham's F-10 medium containing 15%
(v/v) horse serum, 0.8 mM CaCl2, 200 ng/ml recombinant human basic fibroblast growth
factor (b-FGF) and 60 µg/ml gentamicin (proliferation medium). Cultures were induced to
differentiate by switching to growth medium lacking b-FGF and
containing 10% horse serum (differentiation medium). Myoblasts or differentiated myotubes
(three days post switching) were infected at a multiplicity of infection of 2.2 Ad5βdys EAMs
per cell. Fractions containing minimal contamination with hpAP virions (3, 4 and 5 of
passage 6) were used for western and immunofluorescence analysis. Infection was allowed to
proceed for 3 days for both the myoblasts and myotubes before harvesting cells.

ii) Total Protein Extraction And Immunoblot
Analysis

For protein extraction, muscle cells were briefly trypsinized, transferred to a
microcentrifuge tube, centrifuged at 14 K for 3 min at room temp and resuspended two times
in PBS. After an additional centrifugation, the cell pellet was resuspended in 80 µl of RIPA
buffer (50 mM Tris-Cl, pH 7.5; 150 mM NaCl; 1% Nonidet P-40; 0.5% sodium
deoxycholate; 0.1% SDS) (Sambrook et al., supra). The sample was briefly sheared using a
22 gauge needle to reduce viscosity and total protein concentration was assayed using the
bicinchoninic acid protein assay reagent (Pierce, Rockford, IL). Expression of full length
dystrophin or β-galactosidase in infected mdx and MM14 myoblasts or myotubes was
analyzed by electrophoresis of 40 µg of total protein extract on a 6% SDS-PAGE gel (in 25
mM Tris, 192 mM glycine, 10 mM β-mercaptoethanol, 0.1% SDS). After transferring to
Gelman Biotrace NT membrane (in 25 mM Tris, 192 mM glycine, 10 mM β-mercaptoethanol,
0.05% SDS, 20% methanol), the membrane was blocked with 5% non-fat milk and 1% goat
serum in Tris-buffered saline-Tween (TBS-T) for 12 hrs at 4°C. Immunostaining was done
according to the protocol for the ECL western blotting detection reagents (Amersham Life
Sciences, Buckingham, UK). The primary antibodies used were Dys-2 (Vector Laboratories)
and anti-β-galactosidase (BMB, Indianapolis, IN) with a horseradish peroxidase-conjugated
anti-mouse secondary antibody.

Western blot analysis of EAM-infected mdx myoblasts and myotubes (three days post-
fusion) indicated that EAMs were able to infect both of these cell types (Figure 15).
Figure 15 shows immunoblots of protein extracts from mdx myoblasts and myotubes demonstrating
the expression of β-galactosidase (A) and dystrophin (B) in cells infected with Ad5βdys
EAMs. Total protein was extracted 3 days post infection in all cases. Myotubes were
infected at three days following a switch to differentiation media. In Figure 15A, lane 1
contains total protein extract from 293 cells infected with a virus expressing β-galactosidase
as a control (RSV-LacZ); lanes 2-5 contain total protein extracts from uninfected mdx
myoblasts, mdx myoblasts infected with Ad5βdys EAMs, mdx myotubes and mdx myotubes
derived from mdx myoblasts infected with Ad5βdys EAMs, respectively. In Figure 15B,
lanes 1 and 7 contain total protein from mouse muscle ("C57") while lane 2 contains protein
from wild type MM14 myotubes, as controls. Lanes 3-5 contain total protein extracts from
uninfected mdx myoblasts, mdx myoblasts infected with Ad5βdys EAMs, mdx myotubes and
mdx myotubes derived from mdx myoblasts infected with Ad5βdys EAMs, respectively.

As shown in Figure 15, expression of β-galactosidase was detected in both the infected
mdx myoblasts and myotubes, indicating that the CMV promoter was active at both these early
stages of differentiation in muscle cells. However, only infected mdx myotubes produced
protein detected by Dys-2, an antibody recognizing the 17 C-terminal amino acids of
dystrophin (Fig. 15). No dystrophin expression was detected in infected myoblasts by western
analysis, indicating that the muscle creatine kinase promoter functions minimally, if at all,
within the Ad5βdys EAM prior to terminal differentiation of these cells.

Dystrophin expression in mdx cells infected with EAMs was confirmed by
immunofluorescence studies using N-terminal dystrophin antibodies. In agreement with the
western analysis, dystrophin expression from the MCK promoter was detected only in
differentiated mdx myotubes infected by Ad5βdys EAMs (Fig. 16). Figures 16A-C show
immunofluorescence of dystrophin in wild type MM14 myotubes (A), uninfected mdx (B) and
infected mdx myotubes (C), respectively. The results shown in Figure 16 demonstrate the
transfer and expression of recombinant dystrophin to differentiated mdx cells by Ad5βdys
EAMs.



Immunofluorescence of myogenic cells was performed as follows. Approximately 1.5
x 10^6 MM14 or mdx myoblasts were plated on poly-L-lysine (Sigma) coated glass slides (7 x
3 cm) which had been previously etched with a 0.05% chromium potassium sulfate and 0.1% 
gelatin solution. For myotube analysis, the cultures were switched to differentiation media
[Clegg et al. (1987), supra] 48 hours after plating, immediately infected and then allowed to
fuse for 3 days, whereas myoblasts were continuously propagated in proliferation media
[Clegg et al. (1987), supra]. Cells were washed three times with PBS at room temperature
and fixed in 3.7% formaldehyde. For immunostaining, cells were incubated in 0.5% Triton
X-100, blocked with 1% normal goat serum and incubated with an affinity purified antibody
against the N-terminus of murine dystrophin for 2 hrs, followed by extensive washing in PBS
and 0.1% Tween-20 with gentle shaking. Cells were incubated with a 1:200 dilution of biotin
conjugated anti-rabbit antibody (Pierce) for one hour and washed as above. Cells were
further incubated with a 1:300 dilution of streptavidin-fluorescein isothiocyanate conjugate
(Vector Labs, Burlingame, CA) for one hour and washed as above, followed by extensive
washing in PBS.

The above results show that embedded inverted Ad origins of replication coupled to an
encapsidation signal can convert circular DNA molecules to linear forms in the presence of
helper virus and that these genomes can be efficiently encapsidated and propagated to high
titers. Such viruses can be purified on a CsCl gradient and maintain their ability to transduce
cells in vitro, and their increased cloning capacity allows the inclusion of large genes and
tissue specific gene regulatory elements. The above results also show that the dystrophin gene
was expressed in cells transduced by such viruses and that the protein product was correctly
localized to the cell membrane. The above method for preparing EAMs theoretically enables
virtually any gene of interest to be inserted into an infectious minichromosome by
conventional cloning in plasmid vectors, followed by cotransfection with helper viral DNA in
293 cells. This approach is useful for a variety of gene transfer studies in vitro. The
observation that vectors completely lacking viral genes can be used to transfer a full-length
dystrophin cDNA into myogenic cells indicates that this method may be used for the
treatment of DMD using gene therapy.

d) A Modified MCK Enhancer Increases Expression Of Linked 
Genes In Muscle 

The DNA fragment containing the enhancer/promoter of the MCK gene utilized in the
Ad5βdys EAM plasmid is quite large (~3.3 kb). In order to provide a smaller DNA fragment
capable of directing high levels of expression of linked genes in muscle cells, portions of the
3.3 kb MCK enhancer/promoter were deleted and/or modified and inserted in front of a
reporter gene (lacZ). The enhancer element of the MCK gene was modified to produce the
2RS5 enhancer; the sequence of the 2RS5 enhancer is provided in SEQ ID NO:10. The first
6 residues of SEQ ID NO:10 represent a KpnI site added for ease of manipulation of the
modified MCK enhancer element. Residue number 7 of SEQ ID NO:10 corresponds to
residue number 2164 of the wild-type MCK enhancer sequence listed in SEQ ID NO:9
(position 2164 of SEQ ID NO:9 corresponds to position -1256 of the MCK gene). Residue
number 1 7$. of SEQ ID NO:10 corresponds to residue number 2266 of the wild-type MCK
enhancer sequence listed in SEQ ID NO:9.
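
The correspondence between SEQ ID NO:9 numbering and MCK gene coordinates can be
expressed as a simple offset. The sketch below uses only the anchor given above (position
2164 of SEQ ID NO:9 equals position -1256 of the gene) and assumes, hypothetically, that
SEQ ID NO:9 is contiguous and collinear with the gene over the upstream region; the function
names are illustrative.

```python
# Anchor stated in the text: SEQ ID NO:9 position 2164 <-> MCK gene position -1256.
SEQ9_ANCHOR = 2164
GENE_ANCHOR = -1256


def seq9_to_gene(seq9_pos: int) -> int:
    """Map a SEQ ID NO:9 position to an MCK gene coordinate (upstream only)."""
    gene_pos = seq9_pos - SEQ9_ANCHOR + GENE_ANCHOR
    if gene_pos >= 0:
        # Positions at or past +1 are outside what this sketch models.
        raise ValueError("sketch only covers positions upstream of +1")
    return gene_pos


def gene_to_seq9(gene_pos: int) -> int:
    """Map an upstream MCK gene coordinate back to a SEQ ID NO:9 position."""
    if gene_pos >= 0:
        raise ValueError("sketch only covers positions upstream of +1")
    return gene_pos - GENE_ANCHOR + SEQ9_ANCHOR


print(seq9_to_gene(2266))    # the second SEQ ID NO:9 position cited above
print(gene_to_seq9(-1056))   # hypothetical query for an upstream coordinate
```

Under this assumption, residue 2266 of SEQ ID NO:9 would lie at position -1154 of the gene.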

These MCK/lacZ constructs were used to transfect cells in culture (i.e., myogenic
cultures) or were injected as naked DNA into the muscle of mice and β-galactosidase activity
was measured. Figure 17 provides a schematic of the MCK/lacZ constructs tested. The first
construct shown in Figure 17 contains the 3.3 kb wild-type MCK enhancer/promoter
fragment linked to the E. coli lacZ gene. The wild-type enhancer element (-1256 to -1056)
is depicted by the box containing "E"; the core promoter element (-358 to -80) is indicated by
the light cross-hatching and the minimal promoter element (-80 to +7) is indicated by the dark
cross-hatching. The core promoter element is required in addition to the minimal promoter
element (which is required for basal expression) in order to achieve increased muscle-specific
expression in tissue culture [Shield et al. (1996) Mol. Cell. Biol. 16:5058]. The modified
enhancer element, the 2RS5 enhancer, is indicated by the box labeled "E*." The box labeled
"minx" contains a synthetic intron derived from adenovirus and is used to increase expression
of the constructs (any intron may be utilized for this purpose). The box labeled "CMV"
depicts the CMV IE enhancer/promoter which was used as a positive control.

In Figure 17, β-galactosidase activity is expressed relative to the activity of the
wild-type MCK enhancer/promoter construct shown at the top of the figure. As shown in Figure
17, a construct containing the 2RS5 enhancer (SEQ ID NO:10) linked to either the minimal
MCK promoter (a ~261 bp element) or the core and minimal MCK promoter elements (a ~539
bp element) directs higher levels of expression of the reporter gene in muscle cells as
compared to the ~3.3 kb fragment containing the wild-type enhancer element. These modified
enhancer/promoter elements are considerably smaller than the ~3.3 kb fragment used in the
Ad5βdys EAM plasmid and are useful for directing the expression of foreign genes in muscle
cells. These smaller elements are particularly useful for driving the expression of genes in the
context of self-propagating adenoviral vectors which have more severe constraints on the
amount of foreign DNA which can be inserted in comparison to the use of "gutted"
adenoviruses such as the EAMs described above.


EXAMPLE 7 

Generation Of High Titer Stocks Of Encapsidated Adenovirus 
Minichromosomes Containing Minimal Helper Virus Contamination 

The results presented in Example 6 demonstrated that encapsidated adenovirus
minichromosomes (EAMs) can be prepared that lack all viral genes and which can express
full-length dystrophin cDNAs in a muscle specific manner. The propagation of these EAMs
requires the presence of helper adenoviruses that contaminate the final EAM preparation with
conventional adenoviruses (about 4% of the total preparation). In this example the EAM
system is modified to enable the generation of high titer stocks of EAMs with minimal helper
virus contamination. Preferably the EAM stocks contain helper virus representing less than
1%, preferably less than 0.1% and most preferably less than 0.01% of the final viral isolate.
Purified EAMs are then injected in vivo in muscles of dystrophin minus mdx mice to
determine whether these vectors lead to immune rejection and whether they can alleviate
dystrophic symptoms in mdx muscles.



The amount of helper virus present in the EAM preparations is reduced in two ways. 
The first is by selectively controlling the relative packaging efficiency of the helper virus 
versus the EAM virus. The second is to improve physical methods for separating EAM from 
helper virus. These approaches enable the generation of dystrophin-expressing EAMs that are 
contaminated with minimal levels of helper virus. 



a) Development And Characterization Of Adenovirus Packaging 
Cell Lines Expressing The Cre Recombinase From 
Bacteriophage P1

Cell lines expressing a range of Cre levels are used to optimize the amount of helper
virus packaging that occurs during growth of the EAM vectors. The Cre-loxP system [Sauer
and Henderson (1988) Proc. Natl. Acad. Sci. USA 85:5166] is employed to selectively disable
helper virus packaging during growth of EAMs. The bacteriophage P1 Cre recombinase catalyzes
efficient recombination between copies of a 34 bp target sequence called loxP. To delete a
desired sequence, loxP sites are placed at each end of the sequence to be deleted in the same
orientation; in the presence of Cre recombinase the intervening DNA segment is efficiently
excised.
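
The outcome of excision between two directly repeated loxP sites can be illustrated at the
sequence level. The sketch below is a string model only, not a description of the enzymatic
mechanism: it scans one strand, treats the first two loxP copies it finds as the recombining
pair, and uses short placeholder flanking sequences that are hypothetical. The 34 bp loxP
sequence is the one contained in the oligonucleotides described later in this example.

```python
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"  # 34 bp loxP site


def cre_excise(dna: str) -> str:
    """Return the linear product after excision between two direct loxP repeats.

    The segment between the first two loxP copies is removed together with
    one loxP copy, leaving a single loxP site behind. Sequences with fewer
    than two sites on this strand are returned unchanged.
    """
    first = dna.find(LOXP)
    if first == -1:
        return dna
    second = dna.find(LOXP, first + len(LOXP))
    if second == -1:
        return dna
    return dna[:first] + dna[second:]


# Hypothetical toy substrate: left flank, floxed segment, right flank.
left, floxed, right = "GGGG", "TTTGTTTG", "AAAA"
substrate = left + LOXP + floxed + LOXP + right
product = cre_excise(substrate)
print(len(substrate), "->", len(product))
```

In the helper virus design described in this example, the floxed segment corresponds to the
packaging signals, so the product retains its replication origins but lacks the sequences
needed for encapsidation.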

Cell lines expressing a range of Cre recombinase levels are generated. The expression
of too little Cre protein may result in high levels of helper virus being generated, which leads
to unacceptably high levels of helper virus contaminating the final EAM preparation. If very
high levels of Cre expression are present in a cell line, excision of the packaging signal from
the helper virus would be 100% efficient (i.e., it would completely prevent helper virus
packaging). As shown in Example 6, serial passage of EAM preparations containing low
levels of helper virus increased the titer of the EAM. Therefore, it is desirable, at least in the
initial passages of a serial passage, that some helper virus capable of being packaged is
present. A low level of packagable helper virus may be provided by using a cell line
expressing levels of Cre recombinase which are not high enough to achieve excision of the
packaging signals from 100% of the helper virus; these cell lines would be used early in the
serial passaging of the EAM stock and a cell line expressing high enough levels of Cre
recombinase to completely prevent helper virus packaging would be used for the final
passage.

Alternatively, EAMs may be prepared using a packaging cell line that supports high
efficiency Cre recombinase-mediated excision of the packaging signals from the helper virus
by transfection of the Cre-expressing cell line with the EAM plasmid followed by infection of
these cells with loxP-containing helper virus using an MOI of > 1.0.

Human 293 cell lines that express a variety of levels of Cre recombinase are generated
as follows. An expression vector containing the Cre coding region, pOG231, was
cotransfected into 293 cells along with pcDNA3 (Invitrogen); pcDNA3 contains the neo gene
under the control of the SV40 promoter. pOG231 uses the human CMV enhancer/promoter
[derived from CDM8 (Invitrogen)] to express a modified Cre gene (pOG231 was obtained
from S. O'Gorman). The modified Cre gene contained within pOG231 has had a nuclear
localization signal inserted into the coding region to increase the efficiency of recombination.

pOG231 was constructed as follows. A BglII site was introduced into the 5' XbaI site
of the synthetic intron of pMLSISCAT [Huang and Gorman (1990) Nucleic Acids Res.
18:937] by linker tailing. A BglII site in the synthetic intron of pMLSISCAT was destroyed
and a BamHI linker was inserted into the PstI site at the 3' end of the synthetic intron in
pMLSISCAT. BglII and SmaI sites and a nuclear localization signal were introduced into the
5' end of pMC-Cre [Gu et al. (1993) Cell 73:1155] using a PCR fragment that extended from
the novel BglII and SmaI sites to the BamHI site in the Cre coding region. This PCR
fragment was ligated to a BamHI/SalI fragment containing a portion of the Cre coding region
derived from pIC-Cre (Gu et al., supra) and the intron plus Cre coding sequence was inserted
into a modified form of pOG44 [O'Gorman et al. (1991) Science 251:1351] to generate
pOG231. The predicted sequence of pOG231 from the BglII site to the BamHI site located in
the middle of the Cre coding sequence is listed in SEQ ID NO:11.

One 60 mm dish of 293 cells (Microbix) was transfected with 10 µg of fVzdl-
linearized pOG231 and 1 µg of NotI-linearized pcDNA3 using a standard calcium phosphate
precipitation protocol. Two days after the addition of DNA, the transfected cells were split
into three 100 mm dishes and 1000 µg/ml of active G418 was added to the medium. The
cells were fed periodically with G418-containing medium and three weeks later, 24
G418-resistant clones were isolated.

The isolated clones were expanded for testing. Aliquots were frozen in liquid nitrogen
at the earliest possible passage. The neomycin resistant cell lines were examined for the
expression of Cre recombinase using the following transfection assay. The neomycin resistant
cells were transfected with PGK-1-GFP-lacZ (obtained from Sally Camper, Univ. of
Michigan, Ann Arbor, MI), which contains a green fluorescent protein (GFP) expression
cassette that can be excised by Cre recombinase to allow expression of β-galactosidase;
transfection was accomplished using standard calcium phosphate precipitation. Figure 18
provides a schematic of a GFP/β-gal reporter construct suitable for assaying the expression of
Cre recombinase in mammalian cells; GFP sequences and β-gal (i.e., lacZ) sequences are
available from commercial sources (e.g., Clontech, Palo Alto, CA and Pharmacia Biotech,
Piscataway, NJ, respectively).



Control experiments verified that 293 cells transfected with PGK-1-GFP-lacZ
expressed significant amounts of β-galactosidase only if these cells also expressed Cre
recombinase. β-galactosidase assays were performed as described in Example 6. Neomycin
resistant cells expressing Cre recombinase were grouped as high, medium or low expressors
based upon the amount of β-galactosidase activity produced (estimated by direct counting of
β-galactosidase-positive cells per high-power field and by observing the level of staining and
the rapidity with which the blue stain was apparent) when these cell lines were transfected
with the GFP/β-gal reporter construct. Thirteen positive (i.e., Cre-expressing) lines, D608#12,
#7, #22, 18, #\1, #4, #8, #2, #2/2, #13, #5, #\5 and #21 were retained for further use.

The results of this transfection analysis revealed that cultures of 293 cells expressing 
medium to high levels of Cre recombinase could be generated without apparent toxicity. 



b) Generation Of Helper Adenovirus Strains That Contain loxP 
Sites Flanking The Adenovirus Packaging Signals 
Studies of EAM production demonstrated that the EAM vector has a packaging
advantage over the helper adenovirus (Ex. 6). While not limiting the present invention to a
particular mechanism, it is hypothesized that this packaging and replication advantage can be
greatly increased by using helper viruses that approach the packaging size limits of Ad5 [Bett
et al. (1993) J. Virol. 67:5911], by using viruses with mutations in E4 and/or E2 genes [Yeh
et al. (1996) J. Virol. 70:559; Gorziglia et al. (1996) J. Virol. 70:4173; and Amalfitano et al.
(1996) Proc. Natl. Acad. Sci. USA 93:3352], by inclusion of mutations or alterations in the
packaging signals of the helper virus [Imler et al. (1995) Hum. Gene Ther. 6:711] and by
combining these strategies.

The Cre-loxP excision method is used to disable the packaging signals from the helper
virus genomes. The Ad5 packaging domain extends from nucleotide 194 to 358 and is
composed of five distinct elements that are functionally redundant [Hearing et al. (1987) J.
Virol. 61:2555]. Theoretically, any molecule containing the Ad5 origin of replication and
packaging elements should replicate and be packaged into mature virions in the presence of
non-defective helper virus. Disabling the packaging signals should allow replication and gene
expression to proceed, but will prevent packaging of viral DNA into infectious particles. This
in turn should allow the ratio of EAM to helper virus to be increased greatly.
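
Nucleotide positions and map units (mu) are used interchangeably in this example, so a small
conversion helper may be useful. The sketch below relies on the figure given in this example
that one mu equals 360 bp in the Ad5 genome; the function names are illustrative and the
results are approximations.

```python
BP_PER_MAP_UNIT = 360  # conversion stated in this example for the Ad5 genome


def bp_to_mu(bp: float) -> float:
    """Convert an Ad5 nucleotide position to map units."""
    return bp / BP_PER_MAP_UNIT


def mu_to_bp(mu: float) -> float:
    """Convert map units to an approximate length or position in bp."""
    return mu * BP_PER_MAP_UNIT


# Packaging domain boundaries given above (nucleotides 194 to 358).
start_mu, end_mu = bp_to_mu(194), bp_to_mu(358)
print(f"packaging domain spans about {start_mu:.2f} to {end_mu:.2f} mu")

# Approximate genome size at the ~105 mu packaging limit cited in this example.
print(f"105 mu is about {mu_to_bp(105):,.0f} bp")
```

The packaging domain therefore falls at roughly 0.54 to 0.99 mu, consistent with the
statement elsewhere in this example that the packaging signals lie between 0.5 and 1 mu.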

To disable the packaging signals within the helper virus used to encapsidate the EAMs,
the loxP sequences are incorporated into a helper virus that has a genome approaching the
maximal packaging size for Ad [~105 map units; Bett et al. (1993), supra] which further
decreases the efficiency of helper virus packaging. Final virus size can be adjusted by the
choice of introns inserted into the reporter gene, by the choice of reporter genes, or by
including a variety of DNA fragments of various sizes to act as "stuffer" fragments. A
convenient reporter gene is the alkaline phosphatase gene (see Ex. 6). The optimized
Cre-loxP system is also incorporated into a helper viral backbone containing disruptions of any or
all of the E1-E4 genes. Use of such deleted genomes requires viral growth on appropriate
complementing cell lines, such as 293 cells expressing E2 and/or E4 gene products. The
loxP sequences are incorporated into the helper virus by placing the loxP
sequences on either side of the packaging signals on a shuttle vector. This modified shuttle
vector is then used to recombine with the Ad DNA derived from a virus containing
disruptions of any or all of the E1-E4 genes to produce the desired helper virus containing the
packaging signals flanked by loxP sequences.

pAdBglII, an adenovirus shuttle plasmid containing Ad sequences from 0-1 and 9-16
map units (mu), was used as the starting material. Synthetic oligonucleotides were used to
create a polylinker which was inserted at 1 mu within pAdBglII as follows. The BglII LoxP
oligo [5'-GAAGATCTATAACTTCGTATAATGTATGCTATACGAAGTTATTACCGAAGAAATGGCTCGAGATCTTC-3'
(SEQ ID NO:12)] and its reverse complement
[5'-GAAGATCTCGAGCCATTTCTTCGGTAATAACTTCGTATAGCATACATTATACGAAGTTATAGATCTTC-3'
(SEQ ID NO:13)] were synthesized. The AflIII LoxP oligo
[5'-CCACATGTATAACTTCGTATAGCATACATTATACGAAGTTATACATGTGG-3' (SEQ ID NO:14)] and its
reverse complement [5'-CCACATGTATAACTTCGTATAATGTATGCTATACGAAGTTATACATGTGG-3'
(SEQ ID NO:15)] were synthesized. The double stranded form of each loxP oligonucleotide was
digested with the appropriate restriction enzyme (e.g., BglII or AflIII) and inserted into
pAdBglII which had been digested with BglII and AflIII. This resulted in the insertion of
LoxP sequences into the shuttle vector flanking the packaging signals that are located between
0.5 and 1 mu of Ad5 (one mu equals 360 bp in the Ad5 genome; the sequence of the Ad5
genome is listed in SEQ ID NO:1). The 3' loxP sequence was inserted into the BglII site
within pAdBglII. The 5' loxP sequence was inserted into the AflIII site located at base 143
(~0.8 mu) (numbering relative to Ad5).
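
The oligonucleotide pairs listed above can be checked for internal consistency. The sketch
below confirms that each stated reverse complement matches its partner, that a 34 bp loxP
site is present in each duplex, and that the expected restriction sites flank it. The
recognition sequences used (AGATCT for BglII, CTCGAG for XhoI, and ACATGT as one instance of
the AflIII site) are standard values supplied here, not taken from this document.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"
SEQ12 = "GAAGATCTATAACTTCGTATAATGTATGCTATACGAAGTTATTACCGAAGAAATGGCTCGAGATCTTC"
SEQ13 = "GAAGATCTCGAGCCATTTCTTCGGTAATAACTTCGTATAGCATACATTATACGAAGTTATAGATCTTC"
SEQ14 = "CCACATGTATAACTTCGTATAGCATACATTATACGAAGTTATACATGTGG"
SEQ15 = "CCACATGTATAACTTCGTATAATGTATGCTATACGAAGTTATACATGTGG"

for top, bottom in [(SEQ12, SEQ13), (SEQ14, SEQ15)]:
    # Each pair should anneal into a perfect duplex.
    assert reverse_complement(top) == bottom
    # The loxP site appears on one strand or the other of every duplex.
    assert LOXP in top or LOXP in bottom

# Restriction sites at both ends allow trimming before ligation.
print("BglII sites in SEQ ID NO:12:", SEQ12.count("AGATCT"))
print("AflIII (ACATGT) sites in SEQ ID NO:14:", SEQ14.count("ACATGT"))
print("XhoI site in SEQ ID NO:12:", "CTCGAG" in SEQ12)
```

The XhoI site found in the BglII oligonucleotide is the one later used to accept the reporter
cassette.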

DNA sequencing was used to verify the final structure of the modified shuttle plasmid
between 0-1 mu. If Cre recombinase-mediated excision of the packaging signals is found to
be too efficient (as judged by the production of too little helper virus), alternate sites for the
insertion of the loxP sequences are used that would result in deletion of 2, 3 or 4 packaging
sequences rather than all 5 [Grable and Hearing (1990) J. Virol. 64:2047]. The insertion of
loxP sequences at sites along the Ad genome contained within the shuttle vector which would
result in the deletion of 2, 3 or 4 packaging sequences is easily made using the technique of
recombinant PCR [Higuchi (1990) In: PCR Protocols: A Guide to Methods and
Applications, Innis et al. (eds.) Academic Press, San Diego, CA, pp. 177-183]. The optimal
amount of Cre recombinase-mediated excision of the packaging signals is that amount which
permits the production of enough packaged helper virus to permit the slow spread of virus on
the first plate of cells co-transfected with helper virus DNA (containing the loxP sequences)
and the EAM vector. This permits serial passage of the EAM preparation onto a subsequent
lawn of cells to increase the titer of the EAM preparation. Alternatively, if Cre recombinase-
mediated excision of the packaging signals is essentially 100% efficient in all Cre-expressing
cell lines (i.e., regardless of the level of Cre expression, that is even a cell line expressing a
low level of Cre as judged by the GFP/β-gal assay described above), the packaged EAMs
may be used along with helper virus (used at a MOI of ~1.0) to infect the second or
subsequent lawn of cells to permit serial passaging to increase the titer of the EAM
preparation.

Following introduction of the loxP sequences into the shuttle vector, the human
placental alkaline phosphatase (HpAp) cDNA under control of the RSV promoter was inserted
into the polylinker to provide a reporter gene for the helper virus. This is the same reporter
used previously during EAM generation (Ex. 6). The HpAp sequences were inserted as
follows. The loxP-containing shuttle vector was linearized with XhoI and the hpAp cassette
was ligated into the XhoI site (an XhoI site was inserted into pAdBglII during the insertion of
the loxP sequences, as an XhoI site was located on the 3' end of the loxP sequences inserted
into the BglII site of pAdBglII). The HpAp cassette was constructed as follows.
pRSVhAPT40 (obtained from Gary Nabel, Univ. of Michigan, Ann Arbor, MI) was digested
with EcoRI to generate an EcoRI fragment containing the HpAP cDNA and the SV40 intron
and polyadenylation sequences. pRc/RSV (Invitrogen) was digested with HindIII, then
partially digested with EcoRI. A 5,208 bp fragment was then size selected on an agarose gel,
treated with calf alkaline phosphatase, and ligated to the EcoRI fragment derived from
pRSVhAPT40 to generate pRc/RSVAP. pRc/RSVAP was then digested with SalI and XhoI to
liberate the RSV promoter linked to the HpAP cDNA cassette (including the SV40 intron and
polyadenylation sequences). This SalI-XhoI fragment was inserted into the loxP-containing
shuttle vector which had been digested with SalI and XhoI to generate pADLoxP-RSVAP.

Helper virus containing loxP sites flanking the Ad packaging signals is generated by
co-transfection of the LoxP shuttle plasmid (pADLoxP-RSVAP) and ClaI-digested
Ad5dl7001 DNA into 293 cells [Graham and Prevec (1991) Manipulation of Adenovirus
Vectors. In Methods in Molecular Biology, Vol. 7: Gene Transfer and Expression Protocols,
Murray (ed.), Humana Press Inc., Clifton, NJ, pp. 109-128]. Figure 19 provides a schematic
showing the recombination event between the loxP shuttle vector and the Ad5dl7001 genome.
Co-transfection is carried out as described in Example 3c.

Alternatively, the reporter gene may be inserted into the E3 region of pBHG10 or
pBHG11 (using the unique PacI site) rather than into the polylinker located in the E1 region
of the shuttle vector. The reporter gene-containing pBHG10 or pBHG11 is then used in place of
Ad5dl7001 for cotransfection of 293 cells along with the loxP-containing pAdBglII.

Following cotransfection, recombinant plaques are picked, plaque purified, and tested 
for incorporation of both hpAp and loxP sequences by PCR and Southern analysis (Ex. 3). 
Viruses which contain loxP sites flanking the packaging signals and the marker gene (hpAp) 
are retained, propagated and purified. 

The isolated helper virus containing loxP sites flanking the packaging signals and the
marker gene is then used to infect both 293 cells and the 293 cell lines that express Cre
(section a, above). Cre recombinase-expressing cell lines that produce optimal levels of Cre
recombinase when infected with the loxP-containing helper virus are then used for the
generation of EAMs as described below.

Generation Of Encapsidated Adenovirus Minichromosomes
That Express Dystrophin



The growth of EAMs is optimized using the following methods. In the first method,
plasmid DNA from the EAM vector (Ex. 6) is co-transfected into 293 cells with purified
viral DNA from the helper virus, and viruses are harvested 10-14 days later after appearance
of a viral cytopathic effect (CPE) as described in Example 6. This approach is simple, yet
potentially increases helper virus levels by allowing viral spread throughout the culture dishes.

In the second method, the co-transfected 293 cells are overlaid with agar and single
plaques are picked and tested for EAM activity by the ability to express β-galactosidase
activity following infection of HeLa cells (Ex. 6). This second approach is more
time-consuming, but will result in less contamination by helper virus.

The ability of these two methods to generate EAMs is directly compared. The
preferred method is that which produces the highest titer of EAM with the lowest
contamination of helper virus. The efficiency of EAM generation may also be increased by
using helper virus DNA that retains the terminal protein (TP) (i.e., the helper virus is used as
a viral DNA-TPC as described in Ex. 3). The use of viral DNA-TPC has been shown to
increase the efficiency of viral production following transfection of 293 cells by an order of
magnitude [Miyake et al. (1996), supra].

The initial transfection will utilize 293 cells that do not express Cre recombinase, so
that efficient spread of the helper can lead to large scale production of EAM. The method
producing the highest ratio of EAM to helper virus is then employed to optimize conditions
for the serial propagation of the EAM on 293 cells expressing Cre recombinase. This
optimization is conducted using cell lines expressing different levels of Cre recombinase. The
following variables are tested: 1) the ratio of input viral titer to cell number, 2) the number
of serial passages to use for EAM generation, 3) the use of cell lines producing different
levels of Cre to achieve the optimal ratio between high EAM titer and low helper titer, 4)
continuous growth on Cre-producing 293 cells versus alternating between Cre-producing cells
and the parental 293 cells, or alternating between high and low Cre-producing cells, and 5)
whether CsCl purification of EAMs prior to re-infection of 293 cells increases the ratio of the
final EAM/helper titers. The protocols that result in the highest yield of EAM with minimal helper
virus are used to generate large volumes of crude viral lysates for purification by density
gradient centrifugation (Ex. 6).
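The five variables listed above define a small grid of propagation conditions. The following is a minimal sketch of how such a grid might be enumerated and ranked by the measured EAM-to-helper ratio. The class and function names, the example levels for each variable, and the titers passed to `best_condition` are all hypothetical placeholders and are not taken from this example.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Condition:
    input_ratio: float   # 1) input viral titer per cell (hypothetical levels)
    passages: int        # 2) number of serial passages
    cre_level: str       # 3) relative Cre expression of the cell line
    schedule: str        # 4) continuous vs. alternating growth
    cscl_between: bool   # 5) CsCl purification before re-infection


def enumerate_conditions(input_ratios, passages, cre_levels, schedules):
    """Return every combination of the five variables."""
    return [
        Condition(r, p, c, s, b)
        for r, p, c, s, b in product(
            input_ratios, passages, cre_levels, schedules, (False, True)
        )
    ]


def eam_to_helper_ratio(eam_titer, helper_titer):
    """Ratio of EAM titer to helper titer; infinite if no helper is detected."""
    return float("inf") if helper_titer == 0 else eam_titer / helper_titer


def best_condition(measured):
    """Pick the condition with the highest EAM/helper ratio.

    `measured` maps Condition -> (eam_titer, helper_titer).
    """
    return max(measured, key=lambda c: eam_to_helper_ratio(*measured[c]))


# Illustrative levels only; real levels would be chosen experimentally.
conditions = enumerate_conditions(
    input_ratios=(1.0, 10.0),
    passages=(2, 4),
    cre_levels=("low", "high"),
    schedules=("continuous", "alternate_parental", "alternate_high_low"),
)
```

Ranking by ratio alone ignores absolute yield; a combined criterion (for example, a minimum acceptable EAM titer) could be added if both matter.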

EAMs were purified from helper virus using standard CsCl density gradient
centrifugation protocols in Example 6 and resulted in preparations containing ~4% helper virus.
In order to improve the physical separation of EAMs and helper virus, a variety of different
centrifugation conditions are possible, including changing the gradient shape, type of rotor and
tube used, combinations of step and continuous gradients, and the number of gradients used.
Materials with better resolving powers than CsCl can also be employed. These include
rubidium chloride and potassium bromide [Reich and Zarybnicky (1979) Anal. Biochem.
94:193].
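As a small worked illustration of the purity figure quoted above, helper contamination can be expressed as the helper titer divided by the total titer. This definition, the function name, and the example titers are assumptions made for illustration; the source does not state how the percentage was calculated.

```python
def helper_fraction(eam_titer, helper_titer):
    """Fraction of total particles (EAM + helper) that are helper virus.

    Assumes both titers are measured in the same units.
    """
    total = eam_titer + helper_titer
    if total <= 0:
        raise ValueError("total titer must be positive")
    return helper_titer / total


# Hypothetical titers chosen so that helper makes up 4% of the preparation.
example = helper_fraction(eam_titer=9.6e9, helper_titer=4e8)
```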

d) Use Of EAMs Encoding Dystrophin For Long Term
Expression Of Dystrophin In Muscle

To demonstrate the ability of the purified dystrophin expressing EAMs (prepared as
described above) to deliver and express dystrophin in the muscle of an animal, the following
experiment is conducted. First, purified dystrophin-EAMs are delivered to the muscle by
direct intramuscular injection into newborn, 1 month, and adult (3 month) mouse quadriceps
muscle. The dystrophin expressing EAM Ad5βDys is injected into mdx mice and into
transgenic mdx mice that express β-galactosidase in the pituitary gland [Tripathy et al. (1996)
Nature Med. 2:545]. These latter mice are used to avoid potential immune-rejection of cells
expressing β-galactosidase from the Ad5βDys vector. An alternate EAM lacking the
β-galactosidase reporter gene may also be employed; however, the presence of the β-Gal
reporter simplifies EAM growth and purification (vectors lacking β-Gal have their purity
estimated by PCR assays rather than by β-Gal assays).

Following intramuscular injection of EAM, animals are sacrificed at intervals between
1 week and 6 months to measure dystrophin expression (by western blot analysis and by
immunofluorescence [Phelps et al. (1995) Hum. Mol. Genet. 4:1251 and Rafael et al. (1996)
J. Cell Biol. 134:93]), and muscle extracts will also be assayed for β-Gal activity [MacGregor
et al. (1991), supra]. These results are compared with previous results obtained using current
generation viral vectors (i.e., containing deletions in E1 and E3 only) to demonstrate that
EAMs improve the prospects for long term gene expression in muscle.

Use Of Dystrophin-EAMs To Prevent, Halt, Or Reverse The
Dystrophic Symptoms That Develop In The Muscles Of mdx
Mice



To demonstrate that beneficial effects on dystrophic muscle are achieved by delivery
of dystrophin expressing EAMs to mdx mice, the following experiments are conducted.
Central nuclei counts are performed on soleus muscle at intervals following injection of EAM
into newborn, 1 month, and 3 month (adult) mice (Phelps et al., 1995). Central nuclei arise
in mouse muscle only after a myofiber has undergone dystrophic necrosis followed by
regeneration, and are a quantitative measure of the degree of dystrophy that has occurred in a
muscle group (Phelps et al., 1995).

More informative assays are also contemplated. These assays require administration of 
EAMs to mouse diaphragm muscles. 

The diaphragm is severely affected in mdx mice (Stedman et al., 1991; Cox et al.,
1993), and displays dramatic decreases in both force and power generation. Administration of
virus to the diaphragm will allow the strength of the muscles to be measured at intervals
following dystrophin delivery. The force and power generating assays developed by the
Faulkner lab are used to measure the effect of dystrophin transgenes (Shrager et al., 1992;
Rafael et al., 1994; Lynch et al., 1996).

To determine whether dystrophin delivery to dystrophic muscle reverses dystrophy or
stabilizes muscle, varying amounts of the dystrophin EAM are delivered to the diaphragm
at different stages of the dystrophic process and then strength is measured at intervals
following EAM administration. First, various titers of EAM are tested to determine the
minimal amount of virus needed to transduce the majority of muscle fibers in the diaphragm.

It has been shown that conventional adenovirus vectors can transduce the majority of
diaphragm fibers when 10 pfu are administered by direct injection into the intraperitoneal
cavity [Huard et al. (1995) Gene Therapy 2:107]. In addition, it has been shown that
transduction of a simple majority of fibers in a muscle group is sufficient to prevent virtually
all the dystrophic symptoms in mice [Rafael et al. (1994) Hum. Mol. Genet. 3:1725 and
Phelps et al. (1995) Hum. Mol. Genet. 4:1251]. Virus is administered to mdx animals at three
different ages (neonatal, 1 month, and 3 months). Animals are sacrificed for physiological
analysis of diaphragm muscle at two different times post infection (1 month and 3 months).
Error control is achieved by performing these experiments in sextuplicate. Control animals
consist of mock injected wild-type (C57Bl/10) and dystrophic (mdx) mice. Three month old
mdx mice display a 40% reduction in force and power generation compared with wild-type
mice, while 6 month animals display a greater than 50% reduction [Cox et al. (1993) Nature
364:725; Rafael et al. (1994), supra; Phelps et al. (1995), supra; and Corrado et al. (1996) J.
Cell. Biol. 134:873].
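The design described above (three ages at administration, two time points after infection, six replicates) can be laid out explicitly. The sketch below enumerates the treated groups and computes a percent reduction relative to wild-type. The function names and the example force values are hypothetical, and the control groups are not enumerated because the text does not state how many control animals are used per group.

```python
from itertools import product

AGES_AT_INJECTION = ("neonatal", "1 month", "3 months")
TIMES_POST_INFECTION = ("1 month", "3 months")
REPLICATES = 6  # experiments are performed in sextuplicate


def treated_groups():
    """One entry per EAM-treated mdx animal: (age, time point, replicate)."""
    return [
        (age, time_point, replicate)
        for age, time_point in product(AGES_AT_INJECTION, TIMES_POST_INFECTION)
        for replicate in range(1, REPLICATES + 1)
    ]


def percent_reduction(wild_type_value, mdx_value):
    """Percent reduction of an mdx measurement relative to wild-type."""
    if wild_type_value <= 0:
        raise ValueError("wild-type value must be positive")
    return 100.0 * (wild_type_value - mdx_value) / wild_type_value


# Hypothetical force values in arbitrary units, matching a 40% reduction.
example_reduction = percent_reduction(wild_type_value=100.0, mdx_value=60.0)
```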



EXAMPLE 8 

Improved Shuttle Vectors For The 
Production Of Helper Virus Containing LoxP Sites 



In the previous example, a shuttle vector containing adenoviral sequences extending
from 9 mu to 16 mu of the Ad5 genome was modified to contain loxP sequences surrounding
the packaging signals (the LoxP shuttle vector). This modified shuttle vector was then
recombined with an Ad virus to produce a helper virus containing the loxP sequences. This
helper virus was then used to infect cells expressing Cre recombinase along with DNA
comprising a minichromosome containing the dystrophin gene, a reporter gene and the
packaging signals and ITRs of Ad in order to preferentially package the minichromosomes.
Using this approach the helper virus, which has had the packaging signals removed by
Cre-loxP recombination, contains the majority of the Ad genome (only a portion of the E3 region
is deleted). Thus, if low levels of helper virus are packaged and appear in the EAM
preparation, the EAM preparation has the potential of passing on helper virus capable of
directing the expression of Ad proteins in cells which are exposed to the EAM preparation.
The expression of Ad proteins may lead to an immune response directed against the infected
cells.

Another approach to reducing the possibility that the EAM preparation contains helper
virus capable of provoking an immune response is to use helper viruses containing deletions
and/or mutations within the pol and pTP genes. Helper virus containing a deletion in the pol
and/or pTP genes is cotransfected with the EAM construct into 293-derived cell lines
expressing pol or pol and pTP to produce EAMs. Any helper virus present in the purified
EAM preparation will be replication defective due to the deletion in the pol and/or pTP genes.
As shown in Example 3, viruses containing a deletion in the pol gene are incapable of
directing the expression of viral late genes; therefore, helper viruses containing a deletion in
the pol gene or the pol and pTP genes should not be capable of provoking an immune
response (i.e., a CTL response) against late viral proteins synthesized de novo. Shuttle
vectors containing deletions within the Ad pol and/or pTP genes are constructed as described
below.

a) Construction Of A Shuttle Vector Containing The Δpol
Deletion

pAdBglII was modified to contain sequences corresponding to 9 to 40 mu of the Ad5
genome as follows. pAdBglII was digested with BglII and a linker/adapter containing an AscI
site was added to create pAdBglIIAsc. pAdBglIIAsc was then digested with Bst1107I.
Ad5dl7001 viral DNA was digested with AscI and the ends were filled in using T4 DNA
polymerase. The AscI-digested, T4 DNA polymerase filled Ad5dl7001 viral DNA was then
digested with Bst1107I and the ~9.9 kb AscI-Bst1107I(filled) fragment containing the pol and
pTP genes was isolated (as described in Ex. 3) and ligated to the Bst1107I-digested
pAdBglIIAsc to generate pAdAsc. pAdAsc is a shuttle vector which contains the genes
encoding the DNA polymerase and preterminal protein (the inserted AscI-Bst1107I fragment
corresponds to nucleotides 5767 to 15,671 of the Ad5 genome).

A shuttle vector, pAdAscΔpol, which contains the 612 bp deletion in the pol gene
(described in Ex. 3) was constructed as follows. Ad5ΔpolΔE3I viral DNA (Ex. 3) was
digested with AscI and the ends were filled in using T4 DNA polymerase. The AscI-digested,
T4 DNA polymerase filled Ad5ΔpolΔE3I viral DNA was then digested with Bst1107I and the
~9.3 kb AscI-Bst1107I(filled) fragment containing the deleted pol gene and the pTP gene was
isolated (as described in Ex. 3) and ligated to Bst1107I-digested pAdBglIIAsc to generate
pAdAscΔpol.

b) Construction Of A Shuttle Vector Containing The ΔpolΔpTP
Deletion

A shuttle vector, pAdAscΔpolΔpTP, which contains a 2.3 kb deletion within the pol
and pTP genes (described in Ex. 5) was constructed as follows.
pAXBΔpolΔpTPVARNA+t13 (Ex. 5b) was digested with AscI and the ends were filled in
using T4 DNA polymerase. The AscI-digested, T4 DNA polymerase filled
pAXBΔpolΔpTPVARNA+t13 DNA was then digested with Bst1107I and the ~7.6 kb
AscI-Bst1107I(filled) fragment containing the deleted pol gene and the pTP gene was isolated (as
described in Ex. 3) and ligated to Bst1107I-digested pAdBglIIAsc to generate
pAdAscΔpolΔpTP.
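The fragment sizes quoted in sections a) and b) follow from simple arithmetic on the stated coordinates and deletion sizes. The sketch below is only a consistency check: it assumes inclusive nucleotide coordinates, treats the 2.3 kb deletion as exactly 2,300 bp, and takes the 35,935 bp length listed for SEQ ID NO:1 as the genome length for the map-unit conversion. The function names are hypothetical.

```python
GENOME_BP = 35935  # assumed genome length: the length listed for SEQ ID NO:1


def fragment_length(start_nt, end_nt):
    """Length in bp of a fragment given inclusive nucleotide coordinates."""
    return end_nt - start_nt + 1


def map_units(nt, genome_bp=GENOME_BP):
    """Convert a nucleotide position to map units (genome = 100 mu)."""
    return 100.0 * nt / genome_bp


def kb(bp):
    """Size in kilobases, rounded to one decimal place."""
    return round(bp / 1000.0, 1)


# AscI-Bst1107I fragment carrying intact pol and pTP (nt 5767 to 15,671).
intact = fragment_length(5767, 15671)

# Same fragment after the 612 bp pol deletion.
delta_pol = intact - 612

# Same fragment after the 2.3 kb pol/pTP deletion (assumed to be 2,300 bp).
delta_pol_ptp = intact - 2300

sizes_kb = (kb(intact), kb(delta_pol), kb(delta_pol_ptp))
```

Under these assumptions the three sizes round to 9.9, 9.3 and 7.6 kb, which agrees with the fragment sizes stated for pAdAsc, pAdAscΔpol and pAdAscΔpolΔpTP.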

In order to reduce the packaging of the above helper viruses, the pol⁻ or pol⁻, pTP⁻
helper viruses can be modified to incorporate loxP sequences on either side of the packaging
signals as outlined in Example 7. The loxP-containing pol⁻ or pol⁻, pTP⁻ shuttle vectors
(pTP⁻ shuttle vectors may also be employed) are cotransfected into 293 cells expressing Cre
recombinase and pol or Cre recombinase, pol and pTP, respectively, along with an appropriate
E1⁻ viral DNA-TPC (the E1⁻ viral DNA may also contain deletions elsewhere in the genome,
such as in the E4 genes or in the E2a gene, as packaging cell lines expressing the E4 ORF 6
or 6/7 and lines expressing E2a genes are available) to generate helper virus containing loxP
sites flanking the packaging signals as well as a deletion in the pol gene or the pol and pTP
genes. The resulting helper virus(es) is used to cotransfect 293 cells expressing pol or pol
and pTP along with the desired EAM construct. The resulting EAM preparation should
contain little if any helper virus and any contaminating helper virus present would be
replication defective and incapable of expressing viral late gene products. Helper viruses
containing loxP sequences and deletions in all essential early genes may be employed in
conjunction with Cre recombinase-expressing cell lines expressing in trans the E1, E4 ORF
6, E2a, and E2b (e.g., Ad polymerase and pTP) proteins (the E3 proteins are dispensable for
growth in culture). Cell lines coexpressing E1, polymerase and pTP are provided herein.

Cell lines expressing E1 and E4 proteins have been recently described [Krougliak and Graham
(1995) Hum. Gene Ther. 6:1575 and Wang et al. (1995) Gene Ther. 2:775] and cell lines
expressing E1 and E2a proteins have been recently described [Zhou et al. (1996) J. Virol.
70:7030]. Therefore, a cell line co-expressing E1, E2a, E2b, and E4 is constructed by
introduction of expression plasmids containing the E2a and E4 coding regions into the E1-,
Ad polymerase- and pTP-expressing cell lines of the present invention. These packaging cell
lines are used in conjunction with helper viruses containing deletions in the E1, E2a, E2b and
E4 regions.
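The pairing of helper viruses with packaging cell lines described above amounts to checking that every essential region deleted from the helper is supplied in trans by the cell line. A minimal sketch of that check follows. The set of essential regions reflects the text (E3 is dispensable for growth in culture and is therefore excluded), while the function name and the example cell-line descriptions are hypothetical.

```python
# Early regions that must be supplied in trans when deleted from the helper.
ESSENTIAL_REGIONS = frozenset({"E1", "E2a", "E2b", "E4"})


def uncomplemented_regions(helper_deletions, cell_line_functions):
    """Essential regions deleted from the helper but not supplied by the cells.

    An empty result means the cell line can, in principle, support growth of
    the helper virus; deletions in non-essential regions (e.g. E3) are ignored.
    """
    deleted_essential = set(helper_deletions) & ESSENTIAL_REGIONS
    return deleted_essential - set(cell_line_functions)


# Hypothetical examples: an E1/E2b/E3-deleted helper on two cell lines.
helper = {"E1", "E2b", "E3"}
on_parental_293 = uncomplemented_regions(helper, {"E1"})
on_pol_ptp_line = uncomplemented_regions(helper, {"E1", "E2b"})
```

Here `on_parental_293` contains `"E2b"`, indicating that the parental line alone would not complement this helper, while `on_pol_ptp_line` is empty.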

All publications and patents mentioned in the above specification are herein 
incorporated by reference. Various modifications and variations of the described method and 
system of the invention will be apparent to those skilled in the art without departing from the
scope and spirit of the invention. Although the invention has been described in connection 
with specific preferred embodiments, it should be understood that the invention as claimed 
should not be unduly limited to such specific embodiments. Indeed, various modifications of 
the described modes for carrying out the invention which are obvious to those skilled in 
molecular biology or related fields are intended to be within the scope of the following 
claims. 
SEQUENCE LISTING

(1) GENERAL INFORMATION:

     (i) APPLICANT: Chamberlain, Jeffrey S.
                    Amalfitano, Andrea
                    Hauser, Michael A.
                    Kumar-Singh, Rajendra
                    Hartigan-O'Connor, Dennis J.

    (ii) TITLE OF INVENTION: IMPROVED ADENOVIRUS VECTORS

   (iii) NUMBER OF SEQUENCES: 15

    (iv) CORRESPONDENCE ADDRESS:
          (A) ADDRESSEE: Medlen & Carroll, LLP
          (B) STREET: 220 Montgomery Street, Suite 2200
          (C) CITY: San Francisco
          (D) STATE: California
          (E) COUNTRY: United States Of America
          (F) ZIP: 94104

     (v) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Floppy disk
          (B) COMPUTER: IBM PC compatible
          (C) OPERATING SYSTEM: PC-DOS/MS-DOS
          (D) SOFTWARE: PatentIn Release #1.0, Version #1.30

    (vi) CURRENT APPLICATION DATA:
          (A) APPLICATION NUMBER: US
          (B) FILING DATE:
          (C) CLASSIFICATION:

  (viii) ATTORNEY/AGENT INFORMATION:
          (A) NAME: Ingolia, Diane E.
          (B) REGISTRATION NUMBER: 40,027
          (C) REFERENCE/DOCKET NUMBER: UM-02484

    (ix) TELECOMMUNICATION INFORMATION:
          (A) TELEPHONE: (415) 705-8410
          (B) TELEFAX: (415) 397-8338

(2) INFORMATION FOR SEQ ID NO:1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 35935 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
          (A) DESCRIPTION: /desc = "DNA"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

CATCATCAAT AATATACCTT ATTTTGGATT GAAGCCAATA TGATAATGAG GGGGTGGAGT 

TTGTGACGTG GCGCGGGGCG TGGGAACGGG GCGGGTGACG TAGTAGTGTG GCGGAAGTGT

GATGTTGCAA GTGTGGCGGA ACACATGTAA GCGACGGATG TGGCAAAAGT GACGTTTTTG 

GTGTGCGCCG GTGTACACAG GAAGTGACAA TTTTCGCGCG GTTTTAGGCG GATGTTGTAG 

TAAATTTGGG CGTAACCGAG TAAGATTTGG CCATTTTCGC GGGAAAACTG AATAAGAGGA 

AGTGAAATCT GAATAATTTT GTGTTACTCA TAGCGCGTAA TATTTGTCTA GGGCCGCGGG

GACTTTGACC GTTTACGTGG AGACTCGCCC AGGTGTTTTT CTCAGGTGTT TTCCGCGTTC 

CGGGTCAAAG TTGGCGTTTT ATTATTATAG TCAGCTGACG TGTAGTGTAT TTATACCCGG 
TGAGTTCCTC

TCCGACACCG 

AATGGCCGCC 

TCCTAGCCAT 

CGAAGATCCC 

GCAGGAAGGG 

cctttcccgg 

CCTTGTACCG 

CGAGGATGAA 

CAGGTCTTGT 

ctatatgagg 

TAGAGTGGTG 

gaattttgta 

CCAGAACCGG 

CGCCCGACAT 

CCTTCTAACA 

GCCGTGAGAG 

CCTGGGCAAC 

TTGCGTGTGT 

GAGATAATGT 

CGCCGTGGGC 

TTTTCTGCTG 

TTTCTGTGGG 

GAATTTGAAG 

CAGGCGCTTT 

GCGGCTGCTG 

AGCGGGGGGT 

AAGAATCGCC 

CAGCAGCAGC 

GCCGGCCTGG 

GACGCATTTT 

GGGCTTGTGA 

GTCCTGAGTG 

TGGCGCAGAA 

TTGAGGAGGC 
AAGAGGCCAC

GGACTGAAAA 

AGTCTTTTGG 

TTTGAACCAC 

AACGAGGAGG 

ATTGACTTAC 

CAGCCCGAGC 

GAGGTGATCG 

GAGGGTGAGG 

CATTATCACC 

ACCTGTGGCA 

GGTTTGGTGT 

TTGTGATTTT 

AGCCTGCAAG 

CACCTGTGTC 

CACCTCCTGA 

TTGGTGGGCG 

CTTTGGACTT 

GGTTAACGCC 

TTAACTTGCA 

TAATCTTGGT 

TGCGTAACTT 

GCTCATCCCA 

AGCTTTTGAA 

TCCAAGAGAA 

ACCTGCTGGA 

TGCTACTGTT 

AGGAGGAAGC 

ACCCTCGGGA 

GACAATTACA 

GGCTACAGAG 

TATTACTTTT 

GTATTCCATA 

TATTAGGGTA 



TCTTGAGTGC 

TGAGACATAT 

ACCAGCTGAT 

CTACCCTTCA 

CGGTTTCGCA 

TCACTTTTCC 

AGCCGGAGCA 

ATCTTACCTG 

AGTTTGTGTT 

GGAGGAATAC 

TGTTTGTCTA 

GGTAATTTTT 

TTTAAAAGGT 

ACCTACCCGC 

TAGAGAATGC 

GATACACCCG 

TCGCCAGGCT 

GAGCTGTAAA 

TTTGTTTGCT 

TGGCGTGTTA 

T ACATCTG AC 

GCTGGAACAG 

GGCAAAGTTA 

ATCCTGTGGT 

GGTCATCAAG 

GAGTTTTATA 

TTTTCTGGCC 

GTCTTCCGTC 

CAGGCGGCGG 

ATGAATGTTG 

GAGGATGGGC 

GAGGCTAGGA 

CAACAGATCA 

GAGCAGCTGA 

TATGCAAAGG 



CAGCGAGTAG 

TATCTGCCAC 

CGAAGAGGTA 

CGAACTGTAT 

GATTTTTCCC 

GCCGGCGCCC 

GAGAGCCTTG 

CCACGAGGCT 

AGATTATGTG 

GGGGGACCCA 

CAGTAAGTGA 

TTTTTAATTT 

CCTGTGTCTG 

CGTCCTAAAA 

AATAGTAGTA 

GTGGTCCCGC 

GTGGAATGTA 

CGCCCCAGGC 

GAATGAGTTG 

AATGGGGCGG 

CTCATGGAGG 

AGCTCTAACA 

GTCTGCAGAA 

GAGCTGTTTG 

ACTTTGGATT 

AAGGATAAAT 

ATGCATCTGT 

CGCCCGGCGA 

CGGCAGGAGC 

TACAGGTGGC 

AGGGG CT AAA 

ATCTAGCTTT 

AGGATAATTG 

CCACTTACTG 

TGGCACTTAG 



AGTTTTCTCC 

GGAGGTGTTA 

CTGGCTGATA 

GATTTAGACG 

GACTCTGTAA 

GGTTCTCCGG 

GGTCCGGTTT 

GGCTTTCCAC 

GAGCACCCCG 

GATATTATGT 

AAATTATGGG 

TTACAGTTTT 

AACCTGAGCC 

TGGCGCCTGC 

CGGATAGCTG 

TGTGCCCCAT 

TCGAGGACTT 

CATAAGGTGT 

ATGTAAGTTT 

GGCTTAAAGG 

CTTGGGAGTG 

GTACCTCTTG 

TTAAGGAGGA 

ATTCTTTGAA 

TTTCCACACC 

GGAGCGAAGA 

GGAGAGCGGT 

TAATACCGAC 

AGAGCCCATG 

TGAACTGTAT 

GGGGGTAAAG 

TAGCTTAATG 

CGCTAATGAG 

GCTGCAGCCA 

GCCAGATTGC 



TCCGAGCCGC 

TTACCGAAGA 

ATCTTCCACC 

TGACGGCCCC 

TGTTGGCGGT 

AGCCGCCTCA 

CTATGCCAAA 

CCAGTGACGA 

GGCACGGTTG 

GTTCGCTTTG 

CAGTGGGTGA 

GTGGTTTAAA 

TGAGCCCGAG 

TATCCTGAGA 

TGACTCCGGT 

TAAACCAGTT 

GCTTAACGAG 

AAACCTGTGA 

AATAAAGGGT 

GTATATAATG 

TTTGGAAGAT 

GTTTTGGAGG 

TTACAAGTGG 

T CTGGGTC AC 

GGGGCGCGCT 

AACCCATCTG 

TGTGAGACAC 

GGAGGAGCAG 

GAACCCGAGA 

CCAGAACTGA 

AGGGAGCGGG 

ACCAGACACC 

CTTGATCTGC 

GGGGATGATT 

AAGTACAAGA 
TCAGCAAACT TGTAAATATC AGGAATTGTT GCTACATTTC TGGGAACGGG GCCGAGGTGG 2640
AGATAGATAC GGAGGATAGG GTGGCCTTTA GATGTAGCAT GATAAATATG TGGCCGGGGG 2700

TGCTTGGCAT GGACGGGGTG GTTATTATGA ATGTAAGGTT TACTGGCCCC AATTTTAGCG 2760 

GTACGGTTTT CCTGGCCAAT ACCAACCTTA TCCTACACGG TGTAAGCTTC TATGGGTTTA 2820 

ACAATACCTG TGTGGAAGCC TGGACCGATG TAAGGGTTCG GGGCTGTGCC TTTTACTGCT 2880 

GCTGGAAGGG GGTGGTGTGT CGCCCCAAAA GCAGGGCTTC AATTAAGAAA TGCCTCTTTG 2940 

AAAGGTGTAC CTTGGGTATC CTGTCTGAGG GTAACTCCAG GGTGCGCCAC AATGTGGCCT 3000 

CCGACTGTGG TTGCTTCATG CTAGTGAAAA GCGTGGCTGT GATTAAGCAT AACATGGTAT 3060

GTGGCAACTG CGAGGACAGG GCCTCTCAGA TGCTGACCTG CTCGGACGGC AACTGTCACC 3120 

TGCTGAAGAC CATTCACGTA GCCAGCCACT CTCGCAAGGC CTGGCCAGTG TTTGAGCATA 3180 

ACATACTGAC CCGCTGTTCC TTGCATTTGG GTAACAGGAG GGGGGTGTTC CTACCTTACC 3240 

AATGCAATTT GAGTGACACT AAGATATTGC TTGAGCCCGA GAGCATGTCC AAGGTGAACC 3300 

TGAACGGGGT GTTTGACATG ACCATGAAGA TCTGGAAGGT GCTGAGGTAC GATGAGACCC 3360 

GCACCAGGTG CAGACCCTGC GAGTGTGGCG GTAAACATAT TAGGAACCAG CCTGTGATGC 3420 

TGGATGTGAC CGAGGAGCTG AGGCCCGATC ACTTGGTGCT GGCCTGCACC CGCGCTGAGT 3480 

TTGGCTCTAG CGATGAAGAT ACAGATTGAG GTACTGAAAT GTGTGGGCGT GGCTTAAGGG 3540 

TGGGAAAGAA TATATAAGGT GGGGGTCTTA TGTAGTTTTG TATCTGTTTT GCAGCAGCCG 3600 

CCGCCGCCAT GAGCACCAAC TCGTTTGATG GAAGCATTGT GAGCTCATAT TTGACAACGC 3660 

GCATGCCCCC ATGGGCCGGG GTGCGTCAGA ATGTGATGGG CTCCAGCATT GATGGTCGCC 3720 

CCGTCCTGCC CGCAAACTCT ACTACCTTGA CCTACGAGAC CGTGTCTGGA ACGCCGTTGG 3780 

AGACTGCAGC CTCCGCCGCC GCTTCAGCCG CTGCAGCCAC CGCCCGCGGG ATTGTGACTG 3840 

ACTTTGCTTT CCTGAGCCCG CTTGCAAGCA GTGCAGCTTC CCGTTCATCC GCCCGCGATG 3900 

ACAAGTTGAC GGCTCTTTTG GCACAATTGG ATTCTTTGAC CCGGGAACTT AATGTCGTTT 3960

CTCAGCAGCT GTTGGATCTG CGCCAGCAGG TTTCTGCCCT GAAGGCTTCC TCCCCTCCCA 4020

ATGCGGTTTA AAACATAAAT AAAAAACCAG ACTCTGTTTG GATTTGGATC AAGCAAGTGT 4080 

CTTGCTGTCT TTATTTAGGG GTTTTGCGCG CGCGGTAGGC CCGGGACCAG CGGTCTCGGT 4140 

CGTTGAGGGT CCTGTGTATT TTTTCCAGGA CGTGGTAAAG GTGACTCTGG ATGTTCAGAT 4200 

ACATGGGCAT AAGCCGGTCT CTGGGGTGGA GGTAGCACCA CTGCAGAGCT TCATGCTGCG 4260 

GGGTGGTGTT GTAGATGATC CAGTCGTAGC AGGAGCGCTG GGCGTGGTGC CTAAAAATGT 4320 

CTTTCAGTAG CAAGCTGATT GCCAGGGGCA GGCCCTTGGT GTAAGTGTTT ACAAAGCGGT 4380

TAAGCTGGGA TGGGTGCATA CGTGGGGATA TGAGATGCAT CTTGGACTGT ATTTTTAGGT 4440 

TGGCTATGTT CCCAGCCATA TCCCTCCGGG GATTCATGTT GTGCAGAACC ACCAGCACAG 4500 

TGTATCCGGT GCACTTGGGA AATTTGTCAT GTAGCTTAGA AGGAAATGCG TGGAAGAACT 4560 

TGGAGACGCC CTTGTGACCT CCAAGATTTT CCATGCATTC GTCCATAATG ATGGCAATGG 4620 

GCCCACGGGC GGCGGCCTGG GCGAAGATAT TTCTGGGATC ACTAACGTCA TAGTTGTGTT 4680 
CCAGGATGAG

GTATAATGGT 

CTTTGAGTTC 

GGGTAGGGGA 

CGGTGGGCCC 

TGCCGTCATC 

CCCTGACCAA 

CAAAGTTTTT 

GCAGTTCCAG 

CTCCTCGTTT 

ACGGGCCAGG 

GGTGAAGGGG 

GGTGCTGAAG 

GTCATAGTCC 

GCCGCACGAG 

TTCCGGGGAG 

GGTGAGCTCT 

CTTACCTCTG 

CCCGTATACA 

AAACTCGGAC 

GGAGGGGTAG 

GTCGCCCTCT 

TGTTCCTGAA 

ATCGCTGTCT 

TTCTGCGCTA 

GGTGATGCCT 

AAGCTTGGTG 

GGTTTGGTTT 

GCGCGCAACG 

GCGCCAACCG 

GCGCTCGTTG 

TAGCTGCGTC 

GTCGAAGTAG 

AAGCGCGCGC 

GGCGTACATG 




ATCGTCATAG 

TCCATCCGGC 

AGATGGGGGG 

GATCAGCTGG 

GTAAATCACA 

CCTGAGCAGG 

ATCCGCCAGA 

CAACGGTTTG 

GCGGTCCCAC 

CGCGGGTTGG 

GTCATGTCTT 

TGCGCTCCGG 

CGCTGCCGGT 

AGCCCCTCCG 

GGGCAGTGCA 

TAGGCATCCG 

GGCCGTTCGG 

GTTTCCATGA 

GACTTGAGAG 

CACTCTGAGA 

CGGTCGTTGT 

TCGGCATCAA 

GGGGGGCTAT 

GCGAGGGCCA 

AGATTGTCAG 

TTGAGGGTGG 

GCAAACGACC 

TTGTCGCGAT 

CACCGCCATT 

CGGTTGTGCA 

GTCCAGCAGA 

TCGTCCGGGG 

TCTATCTTGC 

TCGTATGGGT 

CCGCAAATGT 



GCCATTTTTA 
CCAGGGGCGT 
ATCATGTCTA 
GAAGAAAGCA 
CCTATTACCG 
GGGGCCACTT 
AGGCGCTCGC 
AGACCGTCCG 
AGCTCGGTCA 
GGCGGCTTTC 
TCCACGGGCG 
GCTGCGCGCT 
CTTCGCCCTG 
CGGCGTGGCC 
GACTTTTGAG 
CGCCGCAGGC 
GGTCAAAAAC 
GCCGGTGTCC 
GCCTGTCCTC 
CAAAGGCTCG 
CCACTAGGGG 
GGAAGGTGAT 
AAAAGGGGGT 
GCTGTTGGGG 
TTTCCAAAAA 
GCGCATCCAT 
CGTAGAGGGC 
CGGCGCGCTC 
CGGG AAAGAC 
GGGTGACAAG 
GGCGGCCGCC 
GGTCTGCGTC 
ATCCTTGCAA 
TGAGTGGGGG 
CGTAAACGTA 



CAAAGCGCGG 

AGTTACCCTC 

CCTGCGGGGC 

GGTTCCTGAG 

GGTGCAACTG 

CGTTAAGCAT 

CGCCCAGCGA 

CCGTAGGCAT 

cctgctctac 

GCTGTACGGC 
C AGGGTCCTC 
GGCCAGGGTG 
CGCG TCGGCC 
CTTGGCGCGC 
GGCGTAGAGC 
CCCGCAGACG 
CAGGTTTCCC 
ACGCTCGGTG 
GAGCGGTGTT 
CGTCCAGGCC 
GTCCACTCGC 
TGGTTTGTAG 
GGGGGCGCGT 
TGAGTACTCC 
CGAGGAGGAT 
CTGGTCAGAA 
GTTGGACAGC 
CTTGGCCGCG 
GGTGGTGCGC 
GTCAACGCTG 
CTTGCGCGAG 
CACGGTAAAG 
GTCTAGCGCC 
ACCCCATGGC 
GAGGGGCTCT 



GCGGAGGGTG 
ACAGATTTGC 
GATGAAGAAA 
CAGCTGCGAC 
GTAGTTAAGA 
GTCCCTGACT 
TAGCAGTTCT 
GCTTTTGAGC 
GGCATCTCGA 
AGTAGTCGGT 
GTCAGCGTAG 
CGCTTGAGGC 
AGGTAGCATT 
AGCTTGCCCT 
TTGGGCGCGA 
GTCTCGCATT 
CCATGCTTTT 
ACGAAAAGGC 
CCGCGGTCCT 
AGCACGAAGG 
TCCAGGGTGT 
GTGTAGGCCA 
TCGTCCTCAC 
CTCTGAAAAG 
TTG AT ATT C A 
AAGACAATCT 
AACTTGGCGA 
ATGTTTAGCT 
TCGTCGGGCA 
GTGGCTACCT 
CAGAATGGCG 
ACCCCGGGCA 
TGCTGCCATG 
ATGGGGTGGG 
CTGAGTATTC 



C C AG ACTGCG 
ATTTCCCACG 
ACGGTTTCCG 
TTACCGCAGC 
GAGCTGCAGC 
CGCATGTTTT 
TGCAAGGAAG 
GTTTGACCAA 
T C C AG CAT AT 
GCTCGTCCAG 
TCTGGGTCAC 
TGGTCCTGCT 
TGACCATGGT 
TGGAGGAGGC 
GAAATACCGA 
CCACGAGCCA 
TGATGCGTTT 
TGTCCGTGTC 
CCTCGTATAG 
AGGCTAAGTG 
GAAGACACAT 
CGTGACCGGG 
TCTCTTCCGC 
CGGGCATGAC 
CCTGGCCCGC 
TTTTGTTGTC 
TGGAGCGCAG 
GCACGTATTC 
CCAGGTGCAC 
CTCCGCGTAG 
GTAGGGGGTC 
GCAGGCGCGC 
CGGGGGCGGC 
TGAGCGCGGA 
CAAGATATGT 
AGGGTAGCAT CTTCCACCGC GGATGCTGGC
AGCGAGGAGG TCGGGACCGA GGTTGCTACG 
CCTGAAGATG GCATGTGAGT TGGATGATAT 
GTCTGTGAGA CCTACCGCGT CACGCACGAA 
CAGCTCGGCG GTGACCTGCA CGTCTAGGGC 
ATACTTATCC TGTCCCTTTT TTTTCCACAG 
TTTCCAGTAC TCTTGGATCG GAAACCCGTC 
GAACTGGTTG ACGGCCTGGT AGGCGCAGCA 
CGCGGCCTTC CGGAGCGAGG TGTGGGTGAG 
GTACTGGTAT TTGAAGTCAG TGTCGTCGCA 

GCGCTTTTTG GAACGCGGAT TTGGCAGGGC 

CGCGCGAGGC ATAAAGTTGC GTGTGATGCG 
AATTACCTGG GCGGCGAGCA CGATCTCGTC 
AAGTTCCAAG AAGCGCGGGA TGCCCTTGAT 
GAGCTCTTCA GGGGAGCTGA GCCCGTGCTC 
GGAAGCGACG AATGAGCTCC ACAGGTCACG 
GGTCCTAAAC TGGCGACCTA TGGCCATTTT 
GTCTTGTTCC CAGCGGTCCC ATCCAAGGTT 
AGGCTCATCT CCGCCGAACT TCATGACCAG 
CCCCATCCAA GTATAGGTCT CTACATCGTA 
CGAGCC3ATC GGGAAGAACT GGATCTCCCG 
GTGAAAGTAG AAGTCCCTGC GACGGGCCGA 
GCAGTACTGG CAGCGGTGCA CGGGCTGTAC 
CACAAGGAAG CAGAGTGGGA ATTTGAGCCC 
TACTTCGGCT GCTTGTCCTT GACCGTCTGG 
CACCACGCCG CGCGAGCCCA AAGTCCAGAT 
AACATCGCGC AGATGGGAGC TGTCCATGGT 
GAGCTCCTGC AGGTTTACCT CGCATAGACG 
CCTAATTTCC AGGGGCTGGT TGGTGGCGGC 
CGGCGCGACT ACGGTACCGC GCGGCGGGCG 
ATCTAAAAGC GGTGACGCGG GCGAGCCCCC 
AGAGGGGGCA GGGGCACGTC GGCGCCGCGC 
TTGCTGGCGA ACGCGACGAC GCGGCGGTTG 
ACGACGGGCC CGGTGAGCTT GAGCCTGAAA 
TTGACGGGGG CCTGGCGCAA AATCTCCTGC 



GCGCACGTAA TCGTATAGTT CGTGCGAGGG 
GGCGGGCTGC TCTGCTCGGA AGACTATCTG 
GGTTGGACGC TGGAAGACGT TGAAGCTGGC 
GGAGGCGTAG GAGTCGCGCA GCTTG TTG AC 
GCAGTAGTCC AGGGTTTCCT TGATGATGTC 
CTCGCGGTTG AGGACAAACT CTTCGCGGTC 
GGCCTCCGAA CGGTAAGAGC CTAGCATGTA 
TCCCTTTTCT ACGGGTAGCG CGTATGCCTG 
CGCAAAGGTG TCCCTGACCA TGACTTTGAG 
TCCGCCCTGC TCCCAGAGCA AAAAGTCCGT 
GAAGGTGACA TCGTTGAAGA GTATCTTTCC 
GAAGGGTCCC GGCACCTCGG AACGGTTGTT 
AAAGCCGTTG ATGTTGTGGC CCACAATGTA 
GGAAGGCAAT TTTTTAAGTT CCTCGTAGGT 
TGAAAGGGCC CAGTCTGCAA GATGAGGGTT 
GGCCATTAGC ATTTGCAGGT GGTCGCGAAA 
TTCTGGGGTG ATGCAGTAGA AGGTAAGCGG 
CGCGGCTAGG TCTCGCGCGG CAGTCACTAG 
CATGAAGGGC ACGAGCTGCT TCCCAAAGGC 
GGTGACAAAG AGACGCTCGG TGCGAGGATG 
CCACCAATTG GAGGAGTGGC TATTGATGTG 
ACACTCGTGC TGGCTTTTGT AAAAACGTGC 
ATCCTGCACG AGGTTGACCT GACGACCGCG 
CTCGCCTGGC GGGTTTGGCT GGTGGTCTTC 
CTGCTCGAGG GGAGTTACGG TGGATCGGAC 
GTCCGCGCGC GGCGGTCGGA GCTTGATGAC 
CTGGAGCTCC CGCGGCGTCA GGTCAGGCGG 
GGTCAGGGCG CGGGCTAGAT CCAGGTGATA 
GTCGATGGCT TGCAAGAGGC GGCATCCCCG 
GTGGGCCGCG GGGGTGTCCT TGGATGATGC 
GGAGGTAGGG GGGGCTCCGG ACCCGCCGGG 
GCGGGCAGGA GCTGGTGCTG CGCGCGTAGG 
ATCTCCTGAA TCTGGCGCCT CTGCGTGAAG 
GAGAGTTCGA CAGAATCAAT TTCGGTGTCG 
ACGTCTCCTG AGTTGTCTTG ATAGGCGATC 
TCGGCCATGA ACTGCTCGAT CTCTTCCTCC TGGAGATCTC CGCGTCCGGC TCGCTCCACG 8940

GTGGCGGCGA GGTCGTTGGA AATGCGGGCC ATGAGCTGCG AGAAGGCGTT GAGGCCTCCC 9000 

TCGTTCCAGA CGCGGCTGTA GACCACGCCC CCTTCGGCAT CGCGGGCGCG CATGACCACC 9060

TGCGCGAGAT TGAGCTCCAC GTGCCGGGCG AAGACGGCGT AGTTTCGCAG GCGCTGAAAG 9120 

AGGTAGTTGA GGGTGGTGGC GGTGTGTTCT GCCACGAAGA AGTACATAAC CCAGCGTCGC 9180 

AACGTGGATT CGTTGATATC CCCCAAGGCC TCAAGGCGCT CCATGGCCTC GTAGAAGTCC 9240 

ACGGCGAAGT TGAAAAACTG GGAGTTGCGC GCCGACACGG TTAACTCCTC CTCCAGAAGA 9300 

CGGATGAGCT CGGCGACAGT GTCGCGCACC TCGCGCTCAA AGGCTACAGG GGCCTCTTCT 9360 

TCTTCTTCAA TCTCCTCTTC CATAAGGGCC TCCCCTTCTT CTTCTTCTGG CGGCGGTGGG 9420 

GGAGGGGGGA CACGGCGGCG ACGACGGCGC ACCGGGAGGC GGTCGACAAA GCGCTCGATC 9480 

ATCTCCCCGC GGCGACGGCG CATGGTCTCG GTGACGGCGC GGCCGTTCTC GCGGGGGCGC 9540 

AGTTGGAAGA CGCCGCCCGT CATGTCCCGG TTATGGGTTG GCGGGGGGCT GCCATGCGGC 9600 

AGGGATACGG CGCTAACGAT GCATCTCAAC AATTGTTGTG TAGGTACTCC GCCGCCGAGG 9660 

GACCTGAGCG AGTCCGCATC GACCGGATCG GAAAACCTCT CGAGAAAGGC GTCTAACCAG 9720 

TCACAGTCGC AAGGTAGGCT GAGCACCGTG GCGGGCGGCA GCGGGCGGCG GTGGGGGTTG 9780 

TTTCTGGCGG AGGTGCTGCT GATGATGTAA TTAAAGTAGG CGGTCTTGAG ACGGCGGATG 9840 

GTCGACAGAA GCACCATGTC CTTGGGTCCG GCCTGCTGAA TGCGCAGGCG GTCGGCCATG 9900 

CCCCAGGCTT CGTTTTGACA TCGGCGCAGG TCTTTGTAGT AGTCTTGCAT GAGCGTTTCT 9960 

ACCGGCACTT CTTCTTGTCC TTCCTCTTGT CCTGCATCTC TTGCATCTAT CGCTGCGGCG 10020 

GCGGCGGAGT TTGGCCGTAG GTGGCGCCCT CTTGCTCCCA TGCGTGTGAC CCCGAAGCCC 10080 

CTCATCGGCT GAAGCAGGGC TAGGTCGGCG ACAACGCGCT CGGCTAATAT GGCCTGCTGC 10140 

ACCTGCGTGA GGGTAGACTG GAAGTCATCG ATGTCCACAA AGCGGTGGTA TGCGCCCGTG 10200 

TTGATGGTGT AAGTGCAGTT GGCCATAACG GACCAGTTAA CGGTCTGGTG ACCCGGCTGC 10260 

GAGAGCTCGG TGTACCTGAG ACGCGAGTAA GCCCTCGAGT CAAATACGTA GTGGTTGCAA 10320 

GTCCGCACCA GGTACTGGTA TCCCACCAAA AAGTGCGGCG GCGGCTGGCG GTAGAGGGGC 10380

CAGCGTAGGG TGGCCGGGGC TCCGGGGGCG AGATCTTCCA ACATAAGGCG ATGATATCCG 10440 

TAGATGTACC TGGACATCCA GGTGATGCCG GCGGCGGTGG TGGAGGCGCG CGGAAAGTGG 10500

CGGACGCGGT TCCAGATGTT GCGCAGCGGC AAAAAGTGCT CCATGGTCGG GACGCTCTGG 10560 

CCGGTCAGGC GCGCGCAATC GTTGACGCTC TAGACCGTGC AAAAGGAGAG CCTGTAAGCG 10620 

GGCACTGTTC CGTGGTCTGG TGGATAAATT CGCAAGGGTA TCATGGCGGA CGACCGGGGT 10680 

TCGAGCCGCG TATCCGGCCG TCCGCCGTGA TCCATGCGGT TACCGCCCGC GTGTGGAACC 10740 

CAGGTGTGCG ACGTCAGACA ACGGGGGAGT GCTCCTTTTG GCTTCCTTCC AGGCGCGGCG 10800 

GCTGCTGCGC TAGCTTTTTT GGCCACTGGC CGCGCGCAGC GTAAGCGGTT AGGCTGGAAA 10860 

GCGAAAGCAT TAAGTGGCTC GCTCCCTGTA GCCGGAGGGT TATTTTCCAA GGGTTGAGTC 10920 

GCGGGACCCC CGGTTCGAGT CTCGGACCGG CCGGACTGCG GCGAACGGGG GTTTGCCTCC 10980 




WO 98/17783 

PCT/US97/19541

CCGTCATGCA AGACCCCGCT TGCAAATTCC TCCGGAAACA GGGACGAGCC CCTTTTTTGC 11040
TTTTCCCAGA TGCATCCGGT GCTGCGGCAG ATGCGCCCCC CTCCTCAGCA GCGGCAAGAG 11100
CAAGAGCAGC GGCAGACATG CAGGGCACCC TCCCCTCCTC CTACCGCGTC AGGAGGGGCG 11160
ACATCCGCGG TTGACGCGGC AGCAGATGGT GATTACGAAC CCCCGCGGCG CCGGGCCCGG 11220
CACTACCTGG ACTTGGAGGA GGGCGAGGGC CTGGCGCGGC TAGGAGCGCC CTCTCCTGAG 11280
CGGTACCCAA GGGTGCAGCT GAAGCGTGAT ACGCGTGAGG CGTACGTGCC GCGGCAGAAC 11340
CTGTTTCGCG ACCGCGAGGG AGAGGAGCCC GAGGAGATGC GGGATCGAAA GTTCCACGCA 11400
GGGCGCGAGC TGCGGCATGG CCTGAATCGC GAGCGGTTGC TGCGCGAGGA GGACTTTGAG 11460
CCCGACGCGC GAACCGGGAT TAGTCCCGCG CGCGCACACG TGGCGGCCGC CGACCTGGTA 11520
ACCGCATACG AGCAGACGGT GAACCAGGAG ATTAACTTTC AAAAAAGCTT TAACAACCAC 11580
GTGCGTACGC TTGTGGCGCG CGAGGAGGTG GCTATAGGAC TGATGCATCT GTGGGACTTT 11640
GTAAGCGCGC TGGAGCAAAA CCCAAATAGC AAGCCGCTCA TGGCGCAGCT GTTCCTTATA 11700
GTGCAGCACA GCAGGGACAA CGAGGCATTC AGGGATGCGC TGCTAAACAT AGTAGAGCCC 11760
GAGGGCCGCT GGCTGCTCGA TTTGATAAAC ATGCTGCAGA GCATAGTGGT GCAGGAGCGC 11820
AGCTTGAGCC TGGCTGACAA GGTGGCCGCC ATCAACTATT CCATGCTTAG CCTGGGCAAG 11880
TTTTACGCCC GCAAGATATA CCATACCCCT TACGTTCCCA TAGACAAGGA GGTAAAGATC 11940
GAGGGGTTCT ACATGCGCAT GGCGCTGAAG GTGCTTACCT TGAGCGACGA CCTGGGCGTT 12000
TATCGCAACG AGCGCATCCA CAAGGCCGTG AGCGTGAGCC GGCGGCGCGA GCTCAGCGAC 12060
CGCGAGCTGA TGCACAGCCT GCAAAGGGCC CTGGCTGGCA CGGGCAGCGG CGATAGAGAG 12120
GCCGAGTCCT ACTTTGACGC GGGCGCTGAC CTGCGCTGGG CCCCAAGCCG ACGCGCCCTG 12180
GAGGCAGCTG GGGCCGGACC TGGGCTGGCG GTGGCACCCG CGCGCGCTGG CAACGTCGGC 12240
GGCGTGGAGG AATATGACGA GGACGATGAG TACGAGCCAG AGGACGGCGA GTACTAAGCG 12300
GTGATGTTTC TGATCAGATG ATGCAAGACG CAACGGACCC GGCGGTGCGG GCGGCGCTGC 12360
AGAGCCAGCC GTCCGGCCTT AACTCCACGG ACGACTGGCG GCAGGTCATG GACCGCATCA 12420
TGTCGCTGAC TGCGCGCAAT CCTGACGCGT TCCGGCAGCA GCCGCAGGCC AACCGGCTCT 12480
CCGCAATTCT GGAAGCGGTG GTCCCGGCGC GCGCAAACCC CACGCACGAG AAGGTGCTGG 12540
CGATCGTAAA CGCGCTGGCC GAAAACAGGG CCATCCGGCC CGACGAGGCC GGCCTGGTCT 12600
ACGACGCGCT GCTTCAGCGC GTGGCTCGTT ACAACAGCGG CAACGTGCAG ACCAACCTGG 12660
ACCGGCTGGT GGGGGATGTG CGCGAGGCCG TGGCGCAGGG TGAGCGCGCG CAGCAGCAGG 12720
GCAACCTGGG CTCCATGGTT GCACTAAACG CCTTCCTGAG TAGACAGCCC GCGAACGTGC 12780
CGCGGGGACA GGAGGACTAC ACCAACTTTG TGAGCGCACT GGGGCTAATG GTGACTGAGA 12840
CACCGCAAAG TGAGGTGTAC CAGTCTGGGC CAGACTATTT TTTCCAGACC AGTAGACAAG 12900
GCCTGCAGAC CGTAAACCTG AGCCAGGCTT TCAAAAACTT GCAGGGGCTG TGGGGGGTGC 12960
GGGCTCCCAC AGGCGACCGC GCGACCGTGT CTAGCTTGCT GACGCCCAAG TCGCGCCTGT 13020
TGCTGCTGCT AATAGCGCCC TTCACGGACA GTGGCAGCGT GTGGCGGGAC ACATACCTAG 13080
GTCACTTGCT GACACTGTAC CGCGAGGCCA TAGGTCAGGC GCATGTGGAC GAGCATACTT 13140
TCCAGGAGAT TACAAGTGTC AGCCGCGCGC TGGGGCAGGA GGACACGGGC AGCCTGGAGG 13200
CAACCCTAAA CTACCTGCTG ACCAACCGGC GGCAGAAGAT CCCCTCGTTG CACAGTTTAA 13260
ACAGCGAGGA GGAGCGCATT TTGCGCTACG TGCAGCAGAG CGTGAGCCTT AACCTGATGC 13320
GCGACGGGGT AACGCCCAGC GTGGCGCTGG ACATGACCGC GCGCAACATG GAACCGGGCA 13380
TGTATGCCTC AAACCGGCCG TTTATCAACC GCCTAATGGA CTACTTGCAT CGCGCGGCCG 13440
CCGTGAACCC CGAGTATTTC ACCAATGCCA TCTTGAACCC GCACTGGCTA CCGCCCCCTG 13500
GTTTCTACAC CGGGGGATTC GAGGTGCCCG AGGGTAACGA TGGATTCCTC TGGGACGACA 13560
TAGACGACAG CGTGTTTTCC CCGCAACCGC AGACCCTGCT AGAGTTGCAA CAGCGCGAGC 13620
AGGCAGAGGC GGCGCTGCGA AAGGAAAGCT TCCGCAGGCC AAGCAGCTTG TCCGATCTAG 13680
GCGCTGCGGC CCCGCGGTCA GATGCTAGTA GCCCATTTCC AAGCTTGATA GGGTCTCTTA 13740
CCAGCACTCG CACCACCCGC CCGCGCCTGC TGGGCGAGGA GGAGTACCTA AACAACTCGC 13800
TGCTGCAGCC GCAGCGCGAA AAAAACCTGC CTCCGGCATT TCCCAACAAC GGGATAGAGA 13860
GCCTAGTGGA CAAGATGAGT AGATGGAAGA CGTACGCGCA GGAGCACAGG GACGTGCCAG 13920
GCCCGCGCCC GCCCACCCGT CGTCAAAGGC ACGACCGTCA GCGGGGTCTG GTGTGGGAGG 13980
ACGATGACTC GGCAGACGAC AGCAGCGTCC TGGATTTGGG AGGGAGTGGC AACCCGTTTG 14040
CGCACCTTCG CCCCAGGCTG GGGAGAATGT TTTAAAAAAA AAAAAGCATG ATGCAAAATA 14100
AAAAACTCAC CAAGGCCATG GCACCGAGCG TTGGTTTTCT TGTATTCCCC TTAGTATGCG 14160
GCGCGCGGCG ATGTATGAGG AAGGTCCTCC TCCCTCCTAC GAGAGTGTGG TGAGCGCGGC 14220
GCCAGTGGCG GCGGCGCTGG GTTCTCCCTT CGATGCTCCC CTGGACCCGC CGTTTGTGCC 14280
TCCGCGGTAC CTGCGGCCTA CCGGGGGGAG AAACAGCATC CGTTACTCTG AGTTGGCACC 14340
CCTATTCGAC ACCACCCGTG TGTACCTGGT GGACAACAAG TCAACGGATG TGGCATCCCT 14400
GAACTACCAG AACGACCACA GCAACTTTCT GACCACGGTC ATTCAAAACA ATGACTACAG 14460
CCCGGGGGAG GCAAGCACAC AGACCATCAA TCTTGACGAC CGGTCGCACT GGGGCGGCGA 14520
CCTGAAAACC ATCCTGCATA CCAACATGCC AAATGTGAAC GAGTTCATGT TTACCAATAA 14580
GTTTAAGGCG CGGGTGATGG TGTCGCGCTT GCCTACTAAG GACAATCAGG TGGAGCTGAA 14640
ATACGAGTGG GTGGAGTTCA CGCTGCCCGA GGGCAACTAC TCCGAGACCA TGACCATAGA 14700
CCTTATGAAC AACGCGATCG TGGAGCACTA CTTGAAAGTG GGCAGACAGA ACGGGGTTCT 14760
GGAAAGCGAC ATCGGGGTAA AGTTTGACAC CCGCAACTTC AGACTGGGGT TTGACCCCGT 14820
CACTGGTCTT GTCATGCCTG GGGTATATAC AAACGAAGCC TTCCATCCAG ACATCATTTT 14880
GCTGCCAGGA TGCGGGGTGG ACTTCACCCA CAGCCGCCTG AGCAACTTGT TGGGCATCCG 14940
CAAGCGGCAA CCCTTCCAGG AGGGCTTTAG GATCACCTAC GATGATCTGG AGGGTGGTAA 15000
CATTCCCGCA CTGTTGGATG TGGACGCCTA CCAGGCGAGC TTGAAAGATG ACACCGAACA 15060
GGGCGGGGGT GGCGCAGGCG GCAGCAACAG CAGTGGCAGC GGCGCGGAAG AGAACTCCAA 15120
CGCGGCAGCC GCGGCAATGC AGCCGGTGGA GGACATGAAC GATCATGCCA TTCGCGGCGA 15180
C^vCCTTTGCC ACACGGGCTo AGGAGAAGCG CGCTGAGGCC GAAGCAGCGG CCGAAGCTGC 15240
CGCCCCCGCT GCGCAACCCG AGGTCGAGAA GCCTCAGAAG AAACCGGTGA TCAAACCCCT 15300
GACAGAGGAC AGCAAGAAAC GCAGTTACAA CCTAATAAGC AATGACAGCA CCTTCACCCA 15360
GTACCGCAGC TGGTACCTTG CATACAACTA CGGCGACCCT CAGACCGGAA TCCGCTCATG 15420
GACCCTGCTT TGCACTCCTG ACGTAACCTG CGGCTCGGAG CAGGTCTACT GGTCGTTGCC 15480
AGACATGATG CAAGACCCCG TGACCTTCCG CTCCACGCGC CAGATCAGCA ACTTTCCGGT 15540
GGTGGGCGCC GAGCTGTTGC CCGTGCACTC CAAGAGCTTC TACAACGACC AGGCCGTCTA 15600
CTCCCAACTC ATCCGCCAGT TTACCTCTCT GACCCACGTG TTCAATCGCT TTCCCGAGAA 15660
CCAGATTTTG GCGCGCCCGC CAGCCCCCAC CATCACCACC GTCAGTGAAA ACGTTCCTGC 15720
TCTCACAGAT CACGGGACGC TACCGCTGCG CAACAGCATC GGAGGAGTCC AGCGAGTGAC 15780
CATTACTGAC GCCAGACGCC GCACCTGCCC CTACGTTTAC AAGGCCCTGG GCATAGTCTC 15840
GCCGCGCGTC CTATCGAGCC GCACTTTTTG AGCAAGCATG TCCATCCTTA TATCGCCCAG 15900
CAATAACACA GGCTGGGGCC TGCGCTTCCC AAGCAAGATG TTTGGCGGGG CCAAGAAGCG 15960
CTCCGACCAA CACCCAGTGC GCGTGCGCGG GCACTACCGC GCGCCCTGGG GCGCGCACAA 16020
ACGCGGCCGC ACTGGGCGCA CCACCGTCGA TGACGCCATC GACGCGGTGG TGGAGGAGGC 16080
GCGCAACTAC ACGCCCACGC CGCCACCAGT GTCCACAGTG GACGCGGCCA TTCAGACCGT 16140
GGTGCGCGGA GCCCGGCGCT ATGCTAAAAT GAAGAGACGG CGGAGGCGCG TAGCACGTCG 16200
CCACCGCCGC CGACCCGGCA CTGCCGCCCA ACGCGCGGCG GCGGCCCTGC TTAACCGCGC 16260
ACGTCGCACC GGCCGACGGG CGGCCATGCG GGCCGCTCGA AGGCTGGCCG CGGGTATTGT 16320
CACTGTGCCC CCCAGGTCCA GGCGACGAGC GGCCGCCGCA GCAGCCGCGG CCATTAGTGC 16380
TATGACTCAG GGTCGCAGGG GCAACGTGTA TTGGGTGCGC GACTCGGTTA GCGGCCTGCG 16440



CGTGCCCGTG CGCACCCGCC CCCCGCGCAA CTAGATTGCA AGAAAAAACT ACTTAGACTC 16500
GTACTGTTGT ATGTATCCAG CGGCGGCGGC GCGCAACGAA GCTATGTCCA AGCGCAAAAT 16560 
CAAAGAAGAG ATGCTCCAGG TCATCGCGCC GGAGATCTAT GGCCCCCCGA AGAAGGAAGA 16620 
GCAGGATTAC AAGCCCCGAA AGCTAAAGCG GGTCAAAAAG AAAAAGAAAG ATGATGATGA 16680
TGAACTTGAC GACGAGGTGG AACTGCTGCA CGCTACCGCG CCCAGGCGAC GGGTACAGTG 16740 
GAAAGGTCGA CGCGTAAAAC GTGTTTTGCG ACCCGGCACC ACCGTAGTCT TTACGCCCGG 16800 
TGAGCGCTCC ACCCGCACCT ACAAGCGCGT GTATGATGAG GTGTACGGCG ACGAGGACCT 16860 
GCTTGAGCAG GCCAACGAGC GCCTCGGGGA GTTTGCCTAC GGAAAGCGGC ATAAGGACAT 16920 
GCTGGCGTTG CCGCTGGACG AGGGCAACCC AACACCTAGC CTAAAGCCCG TAACACTGCA 16980 
GCAGGTGCTG CCCGCGCTTG CACCGTCCGA AGAAAAGCGC GGCCTAAAGC GCGAGTCTGG 17040 
TGACTTGGCA CCCACCGTGC AGCTGATGGT ACCCAAGCGC CAGCGACTGG AAGATGTCTT 17100 
GGAAAAAATG ACCGTGGAAC CTGGGCTGGA GCCCGAGGTC CGCGTGCGGC CAATCAAGCA 17160 
GGTGGCGCCG GGACTGGGCG TGCAGACCGT GGACGTTCAG ATACCCACTA CCAGTAGCAC 17220
CAGTATTGCC ACCGCCACAG AGGGCATGGA GACACAAACG TCCCCGGTTG CCTCAGCGGT 17280
GGCGGATGCC GCGGTGCAGG CGGTCGCTGC GGCCGCGTCC AAGACCTCTA CGGAGGTGCA 17340
AACGGACCCG TGGATGTTTC GCGTTTCAGC CCCCCGGCGC CCGCGCGGTT CGAGGAAGTA 17400
CGGCGCCGCC AGCGCGCTAC TGCCCGAATA TGCCCTACAT CCTTCCATTG CGCCTACCCC 17460
CGGCTATCGT GGCTACACCT ACCGCCCCAG AAGACGAGCA ACTACCCGAC GCCGAACCAC 17520
CACTGGAACC CGCCGCCGCC GTCGCCGTCG CCAGCCCGTG CTGGCCCCGA TTTCCGTGCG 17580
CAGGGTGGCT CGCGAAGGAG GCAGGACCCT GGTGCTGCCA ACAGCGCGCT ACCACCCCAG 17640
CATCGTTTAA AAGCCGGTCT TTGTGGTTCT TGCAGATATG GCCCTCACCT GCCGCCTCCG 17700
TTTCCCGGTG CCGGGATTCC GAGGAAGAAT GCACCGTAGG AGGGGCATGG CCGGCCACGG 17760
CCTGACGGGC GGCATGCGTC GTGCGCACCA CCGGCGGCGG CGCGCGTCGC ACCGTCGCAT 17820
GCGCGGCGGT ATCCTGCCCC TCCTTATTCC ACTGATCGCC GCGGCGATTG GCGCCGTGCC 17880
CGGAATTGCA TCCGTGGCCT TGCAGGCGCA GAGACACTGA TTAAAAACAA GTTGCATGTG 17940
GAAAAATCAA AATAAAAAGT CTGGACTCTC ACGCTCGCTT GGTCCTGTAA CTATTTTGTA 18000
GAATGGAAGA CATCAACTTT GCGTCTCTGG CCCCGCGACA CGGCTCGCGC CCGTTCATGG 18060
GAAACTGGCA AGATATCGGC ACCAGCAATA TGAGCGGTGG CGCCTTCAGC TGGGGCTCGC 18120
TGTGGAGCGG CATTAAAAAT TTCGGTTCCA CCGTTAAGAA CTATGGCAGC AAGGCCTGGA 18180
ACAGCAGCAC AGGCCAGATG CTGAGGGATA AGTTGAAAGA GCAAAATTTC CAACAAAAGG 18240
TGGTAGATGG CCTGGCCTCT GGCATTAGCG GGGTGGTGGA CCTGGCCAAC CAGGCAGTGC 18300
AAAATAAGAT TAACAGTAAG CTTGATCCCC GCCCTCCCGT AGAGGAGCCT CCACCGGCCG 18360
TGGAGACAGT GTCTCCAGAG GGGCGTGGCG AAAAGCGTCC GCGCCCCGAC AGGGAAGAAA 18420
CTCTGGTGAC GCAAATAGAC GAGCCTCCCT CGTACGAGGA GGCACTAAAG CAAGGCCTGC 18480
CCACCACCCG TCCCATCGCG CCCATGGCTA CCGGAGTGCT GGGCCAGCAC ACACCCGTAA 18540
CGCTGGACCT GCCTCCCCCC GCGGACACCC AGCAGAAACC TGTGCTGCCA GGCCCGACCG 18600
CCGTTGTTGT AACCCGTCCT AGCCGCGCGT CCCTGCGCCG CGCCGCCAGC GGTCCGCGAT 18660
CGTTGCGGCC CGTAGCCAGT GGCAACTGGC AAAGCACACT GAACAGCATC GTGGGTCTGG 18720
GGGTGCAATC CCTGAAGCGC CGACGATGCT TCTGAATAGC TAACGTGTCG TATGTGTGTC 18780
ATGTATGCGT CCATGTCGCC GCCAGAGGAG CTGCTGAGCC GCCGCGCGCC CGCTTTCCAA 18840
GATGGCTACC CCTTCGATGA TGCCGCAGTG GTCTTACATG CACATCTCGG GCCAGGACGC 18900
CTCGGAGTAC CTGAGCCCCG GGCTGGTGCA GTTTGCCCGC GCCACCGAGA CGTACTTCAG 18960
CCTGAATAAC AAGTTTAGAA ACCCCACGGT GGCGCCTACG CACGACGTGA CCACAGACCG 19020
GTCCCAGCGT TTGACGCTGC GGTTCATCCC TGTGGACCGT GAGGATACTG CGTACTCGTA 19080
CAAGGCGCGG TTCACCCTAG CTGTGGGTGA TAACCGTGTG CTGGACATGG CTTCCACGTA 19140
CTTTGACATC CGCGGCGTGC TGGACAGGGG CCCTAGTTTT AAGCCCTACT CTGGCACTGC 19200
CTACAACGCC CTGGCTCCCA AGGGTGCCCC AAATCCTTGC GAATGGGATG AAGCTGCTAC 19260
TGCTCTTGAA ATAAACCTAG AAGAAGAGGA CGATGACAAC GAAGACGAAG TAGACGAGCA 19320
AGCTGAGCAG CAAAAAACTC ACGTATTTGG GCAGGCGCCT TATTCTGGTA TAAATATTAC 19380





AAAGGAGGGT ATTCAAATAG GTGTCGAAGG TCAAACACCT AAATATGCCG ATAAAACATT 19440
TCAACCTGAA CCTCAAATAG GAGAATCTCA GTGGTACGAA ACTGAAATTA ATCATGCAGC 19500
TGGGAGAGTC CTTAAAAAGA CTACCCCAAT GAAACCATGT TACGGTTCAT ATGCAAAACC 19560
CACAAATGAA AATGGAGGGC AAGGCATTCT TGTAAAGCAA CAAAATGGAA AGCTAGAAAG 19620
TCAAGTGGAA ATGCAATTTT TCTCAACTAC TGAGGCGACC GCAGGCAATG GTGATAACTT 19680
GACTCCTAAA GTGGTATTGT ACAGTGAAGA TGTAGATATA GAAACCCCAG ACACTCATAT 19740
TTCTTACATG CCCACTATTA AGGAAGGTAA CTCACGAGAA CTAATGGGCC AACAATCTAT 19800
GCCCAACAGG CCTAATTACA TTGCTTTTAG GGACAATTTT ATTGGTCTAA TGTATTACAA 19860
CAGCACGGGT AATATGGGTG TTCTGGCGGG CCAAGCATCG CAGTTGAATG CTGTTGTAGA 19920
TTTGCAAGAC AGAAACACAG AGCTTTCATA CCAGCTTTTG CTTGATTCCA TTGGTGATAG 19980
AACCAGGTAC TTTTCTATGT GGAATCAGGC TGTTGACAGC TATGATCCAG ATGTTAGAAT 20040
TATTGAAAAT CATGGAACTG AAGATGAACT TCCAAATTAC TGCTTTCCAC TGGGAGGTGT 20100
GATTAATACA GAGACTCTTA CCAAGGTAAA ACCTAAAACA GGTCAGGAAA ATGGATGGGA 20160
AAAAGATGCT ACAGAATTTT CAGATAAAAA TGAAATAAGA GTTGGAAATA ATTTTGCCAT 20220
GGAAATCAAT CTAAATGCCA ACCTGTGGAG AAATTTCCTG TACTCCAACA TAGCGCTGTA 20280
TTTGCCCGAC AAGCTAAAGT ACAGTCCTTC CAACGTAAAA ATTTCTGATA ACCCAAACAC 20340
CTACGACTAC ATGAACAAGC GAGTGGTGGC TCCCGGGTTA GTGGACTGCT ACATTAACCT 20400
TGGAGCACGC TGGTCCCTTG ACTATATGGA CAACGTCAAC CCATTTAACC ACCACCGCAA 20460
TGCTGGCCTG CGCTACCGCT CAATGTTGCT GGGCAATGGT CGCTATGTGC CCTTCCACAT 20520
CCAGGTGCCT CAGAAGTTCT TTGCCATTAA AAACCTCCTT CTCCTGCCGG GCTCATACAC 20580
CTACGAGTGG AACTTCAGGA AGGATGTTAA CATGGTTCTG CAGAGCTCCC TAGGAAATGA 20640
CCTAAGGGTT GACGGAGCCA GCATTAAGTT TGATAGCATT TGCCTTTACG CCACCTTCTT 20700
CCCCATGGCC CACAACACCG CCTCCACGCT TGAGGCCATG CTTAGAAACG ACACCAACGA 20760
CCAGTCCTTT AACGACTATC TCTCCGCCGC CAACATGCTC TACCCTATAC CCGCCAACGC 20820
TACCAACGTG CCCATATCCA TCCCCTCCCG CAACTGGGCG GCTTTCCGCG GCTGGGCCTT 20880
CACGCGCCTT AAGACTAAGG AAACCCCATC ACTGGGCTCG GGCTACGACC CTTATTACAC 20940
CTACTCTGGC TCTATACCCT ACCTAGATGG AACCTTTTAC CTCAACCACA CCTTTAAGAA 21000
GGTGGCCATT ACCTTTGACT CTTCTGTCAG CTGGCCTGGC AATGACCGCC TGCTTACCCC 21060
CAACGAGTTT GAAATTAAGC GCTCAGTTGA CGGGGAGGGT TACAACGTTG CCCAGTGTAA 21120
CATGACCAAA GACTGGTTCC TGGTACAAAT GCTAGCTAAC TACAACATTG GCTACCAGGG 21180
CTTCTATATC CCAGAGAGCT ACAAGGACCG CATGTACTCC TTCTTTAGAA ACTTCCAGCC 21240
CATGAGCCGT CAGGTGGTGG ATGATACTAA ATACAAGGAC TACCAACAGG TGGGCATCCT 21300
ACACCAACAC AACAACTCTG GATTTGTTGG CTACCTTGCC CCCACCATGC GCGAAGGACA 21360
GGCCTACCCT GCTAACTTCC CCTATCCGCT TATAGGCAAG ACCGCAGTTG ACAGCATTAC 21420
CCAGAAAAAG TTTCTTTGCG ATCGCACCCT TTGGCGCATC CCATTCTCCA GTAACTTTAT 21480
GTCCATGGGC GCACTCACAG ACCTGGGCCA AAACCTTCTC TACGCCAACT CCGCCCACGC 21540
GCTAGACATG ACTTTTGAGG TGGATCCCAT GGACGAGCCC ACCCTTCTTT ATGTTTTGTT 21600
TGAAGTCTTT GACGTGGTCC GTGTGCACCG GCCGCACCGC GGCGTCATCG AAACCGTGTA 21660
CCTGCGCACG CCCTTCTCGG CCGGCAACGC CACAACATAA AGAAGCAAGC AACATCAACA 21720
ACAGCTGCCG CCATGGGCTC CAGTGAGCAG GAACTGAAAG CCATTGTCAA AGATCTTGGT 21780
TGTGGGCCAT ATTTTTTGGG CACCTATGAC AAGCGCTTTC CAGGCTTTGT TTCTCCACAC 21840
AAGCTCGCCT GCGCCATAGT CAATACGGCC GGTCGCGAGA CTGGGGGCGT ACACTGGATG 21900
GCCTTTGCCT GGAACCCGCA CTCAAAAACA TGCTACCTCT TTGAGCCCTT TGGCTTTTCT 21960
GACCAGCGAC TCAAGCAGGT TTACCAGTTT GAGTACGAGT CACTCCTGCG CCGTAGCGCC 22020
ATTGCTTCTT CCCCCGACCG CTGTATAACG CTGGAAAAGT CCACCCAAAG CGTACAGGGG 22080
CCCAACTCGG CCGCCTGTGG ACTATTCTGC TGCATGTTTC TCCACGCCTT TGCCAACTGG 22140
CCCCAAACTC CCATGGATCA CAACCCCACC ATGAACCTTA TTACCGGGGT ACCCAACTCC 22200
ATGCTCAACA GTCCCCAGGT ACAGCCCACC CTGCGTCGCA ACCAGGAACA GCTCTACAGC 22260
TTCCTGGAGC GCCACTCGCC CTACTTCCGC AGCCACAGTG CGCAGATTAG GAGCGCCACT 22320
TCTTTTTGTC ACTTGAAAAA CATGTAAAAA TAATGTACTA GAGACACTTT CAATAAAGGC 22380
AAATGCTTTT ATTTGTACAC TCTCGGGTGA TTATTTACCC CCACCCTTGC CGTCTGCGCC 22440
GTTTAAAAAT CAAAGGGGTT CTGCCGCGCA TCGCTATGCG CCACTGGCAG GGACACGTTG 22500
CGATACTGGT GTTTAGTGCT CCACTTAAAC TCAGGCACAA CCATCCGCGG CAGCTCGGTG 22560
AAGTTTTCAC TCCACAGGCT GCGCACCATC ACCAACGCGT TTAGCAGGTC GGGCGCCGAT 22620
ATCTTGAAGT CGCAGTTGGG GCCTCCGCCC TGCGCGCGCG AGTTGCGATA CACAGGGTTG 22680
CAGCACTGGA ACACTATCAG CGCCGGGTGG TGCACGCTGG CCAGCACGCT CTTGTCGGAG 22740
ATCAGATCCG CGTCCAGGTC CTCCGCGTTG CTCAGGGCGA ACGGAGTCAA CTTTGGTAGC 22800
TGCCTTCCCA AAAAGGGCGC GTGCCCAGGC TTTGAGTTGC ACTCGCACCG TAGTGGCATC 22860
AAAAGGTGAC CGTGCCCGGT CTGGGCGTTA GGATACAGCG CCTGCATAAA AGCCTTGATC 22920
TGCTTAAAAG CCACCTGAGC CTTTGCGCCT TCAGAGAAGA ACATGCCGCA AGACTTGCCG 22980
GAAAACTGAT TGGCCGGACA GGCCGCGTCG TGCACGCAGC ACCTTGCGTC GGTGTTGGAG 23040
ATCTGCACCA CATTTCGGCC CCACCGGTTC TTCACGATCT TGGCCTTGCT AGACTGCTCC 23100
TTCAGCGCGC GCTGCCCGTT TTCGCTCGTC ACATCCATTT CAATCACGTG CTCCTTATTT 23160
ATCATAATGC TTCCGTGTAG ACACTTAAGC TCGCCTTCGA TCTCAGCGCA GCGGTGCAGC 23220
CACAACGCGC AGCCCGTGGG CTCGTGATGC TTGTAGGTCA CCTCTGCAAA CGACTGCAGG 23280
TACGCCTGCA GGAATCGCCC CATCATCGTC ACAAAGGTCT TGTTGCTGGT GAAGGTCAGC 23340
TGCAACCCGC GGTGCTCCTC GTTCAGCCAG GTCTTGCATA CGGCCGCCAG AGCTTCCACT 23400
TGGTCAGGCA GTAGTTTGAA GTTCGCCTTT AGATCGTTAT CCACGTGGTA CTTGTCCATC 23460
AGCGCGCGCG CAGCCTCCAT GCCCTTCTCC CACGCAGACA CGATCGGCAC ACTCAGCGGG 23520
TTCATCACCG TAATTTCACT TTCCGCTTCG CTGGGCTCTT CCTCTTCCTC TTGCGTCCGC 23580
ATACCACGCG CCACTGGGTC GTCTTCATTC AGCCGCCGCA CTGTGCGCTT ACCTCCTTTG 23640

CCATGCTTGA TTAGCACCGG TGGGTTGCTG AAACCCACCA TTTGTAGCGC CACATCTTCT 23700 

CTTTCTTCCT CGCTGTCCAC GATTACCTCT GGTGATGGCG GGCGCTCGGG CTTGGGAGAA 23760 

GGGCGCTTCT TTTTCTTCTT GGGCGCAATG GCCAAATCCG CCGCCGAGGT CGATGGCCGC 23820

GGGCTGGGTG TGCGCGGCAC CAGCGCGTCT TGTGATGAGT CTTCCTCGTC CTCGGACTCG 23880 

ATACGCCGCC TCATCCGCTT TTTTGGGGGC GCCCGGGGAG GCGGCGGCGA CGGGGACGGG 23940 

GACGACACGT CCTCCATGGT TGGGGGACGT CGCGCCGCAC CGCGTCCGCG CTCGGGGGTG 24000 

GTTTCGCGCT GCTCCTCTTC CCGACTGGCC ATTTCCTTCT CCTATAGGCA GAAAAAGATC 24060

ATGGAGTCAG TCGAGAAGAA GGACAGCCTA ACCGCCCCCT CTGAGTTCGC CACCACCGCC 24120 

TCCACCGATG CCGCCAACGC GCCTACCACC TTCCCCGTCG AGGCACCCCC GCTTGAGGAG 24180 

GAGGAAGTGA TTATCGAGCA GGACCCAGGT TTTGTAAGCG AAGACGACGA GGACCGCTCA 24240

GTACCAACAG AGGATAAAAA GCAAGACCAG GACAACGCAG AGGCAAACGA GGAACAAGTC 24300

GGGCGGGGGG ACGAAAGGCA TGGCGACTAC CTAGATGTGG GAGACGACGT GCTGTTGAAG 24360 

CATCTGCAGC GCCAGTGCGC CATTATCTGC GACGCGTTGC AAGAGCGCAG CGATGTGCCC 24420 

CTCGCCATAG CGGATGTCAG CCTTGCCTAC GAACGCCACC TATTCTCACC GCGGGTACCC 24480 

CCCAAACGCC AAGAAAACGG CACATGCGAG CCCAACCCGC GCCTCAACTT CTACCCCGTA 24540 

TTTGCCGTGC CAGAGGTGCT TGCCACCTAT CACATCTTTT TCCAAAACTG CAAGATACCC 24600

CTATCCTGCC GTGCCAACCG CAGCCGAGCG GACAAGCAGC TGGCCTTGCG GCAGGGCGCT 24660 

GTCATACCTG ATATCGCCTC GCTCAACGAA GTGCCAAAAA TCTTTGAGGG TCTTGGACGC 24720 

GACGAGAAGC GCGCGGCAAA CGCTCTGCAA CAGGAAAACA GCGAAAATGA AAGTCACTCT 24780 

GGAGTGTTGG TGGAACTCGA GGGTGACAAC GCGCGCCTAG CCGTACTAAA ACGCAGCATC 24840

GAGGTCACCC ACTTTGCCTA CCCGGCACTT AACCTACCCC CCAAGGTCAT GAGCACAGTC 24900 
ATGAGTGAGC TGATCGTGCG CCGTGCGCAG CCCCTGGAGA GGGATGCAAA TTTGCAAGAA 24960 
CAAACAGAGG AGGGCCTACC CGCAGTTGGC GACGAGCAGC TAGCGCGCTG GCTTCAAACG 25020 
CGCGAGCCTG CCGACTTGGA GGAGCGACGC AAACTAATGA TGGCCGCAGT GCTCGTTACC 25080
GTGGAGCTTG AGTGCATGCA GCGGTTCTTT GCTGACCCGG AGATGCAGCG CAAGCTAGAG 25140
GAAACATTGC ACTACACCTT TCGACAGGGC TACGTACGCC AGGCCTGCAA GATCTCCAAC 25200
GTGGAGCTCT GCAACCTGGT CTCCTACCTT GGAATTTTGC ACGAAAACCG CCTTGGGCAA 25260
AACGTGCTTC ATTCCACGCT CAAGGGCGAG GCGCGCCGCG ACTACGTCCG CGACTGCGTT 25320
TACTTATTTC TATGCTACAC CTGGCAGACG GCCATGGGCG TTTGGCAGCA GTGCTTGGAG 25380
GAGTGCAACC TCAAGGAGCT GCAGAAACTG CTAAAGCAAA ACTTGAAGGA CCTATGGACG 25440
GCCTTCAACG AGCGCTCCGT GGCCGCGCAC CTGGCGGACA TCATTTTCCC CGAACGCCTG 25500
CTTAAAACCC TGCAACAGGG TCTGCCAGAC TTCACCAGTC AAAGCATGTT GCAGAACTTT 25560
AGGAACTTTA TCCTAGAGCG CTCAGGAATC TTGCCCGCCA CCTGCTGTGC ACTTCCTAGC 25620
GACTTTGTGC CCATTAAGTA CCGCGAATGC CCTCCGCCGC TTTGGGGCCA CTGCTACCTT 25680
CTGCAGCTAG CCAACTACCT TGCCTACCAC TCTGACATAA TGGAAGACGT GAGCGGTGAC 25740

GGTCTACTGG AGTGTCACTG TCGCTGCAAC CTATGCACCC CGCACCGCTC CCTGGTTTGC 25800 

AATTCGCAGC TGCTTAACGA AAGTCAAATT ATCGGTACCT TTGAGCTGCA GGGTCCCTCG 25860

CCTGACGAAA AGTCCGCGGC TCCGGGGTTG AAACTCACTC CGGGGCTGTG GACGTCGGCT 25920 

TACCTTCGCA AATTTGTACC TGAGGACTAC CACGCCCACG AGATTAGGTT CTACGAAGAC 25980 

CAATCCCGCC CGCCAAATGC GGAGCTTACC GCCTGCGTCA TTACCCAGGG CCACATTCTT 26040 

GGCCAATTGC AAGCCATCAA CAAAGCCCGC CAAGAGTTTC TGCTACGAAA GGGACGGGGG 26100 

GTTTACTTGG ACCCCCAGTC CGGCGAGGAG CTCAACCCAA TCCCCCCGCC GCCGCAGCCC 26160

TATCAGCAGC AGCCGCGGGC CCTTGCTTCC CAGGATGGCA CCCAAAAAGA AGCTGCAGCT 26220 

GCCGCCGCCA CCCACGGACG AGGAGGAATA CTGGGACAGT CAGGCAGAGG AGGTTTTGGA 26280 

CGAGGAGGAG GAGGACATGA TGGAAGACTG GGAGAGCCTA GACGAGGAAG CTTCCGAGGT 26340 

CGAAGAGGTG TCAGACGAAA CACCGTCACC CTCGGTCGCA TTCCCCTCGC CGGCGCCCCA 26400 

GAAATCGGCA ACCGGTTCCA GCATGGCTAC AACCTCCGCT CCTCAGGCGC CGCCGGCACT 26460

GCCCGTTCGC CGACCCAACC GTAGATGGGA CACCACTGGA ACCAGGGCCG GTAAGTCCAA 26520 

GCAGCCGCCG CCGTTAGCCC AAGAGCAACA ACAGCGCCAA GGCTACCGCT CATGGCGCGG 26580

GCACAAGAAC GCCATAGTTG CTTGCTTGCA AGACTGTGGG GGCAACATCT CCTTCGCCCG 26640 

CCGCTTTCTT CTCTACCATC ACGGCGTGGC CTTCCCCCGT AACATCCTGC ATTACTACCG 26700 

TCATCTCTAC AGCCCATACT GCACCGGCGG CAGCGGCAGC GGCAGCAACA GCAGCGGCCA 26760 

CACAGAAGCA AAGGCGACCG GATAGCAAGA CTCTGACAAA GCCCAAGAAA TCCACAGCGG 26820 

CGGCAGCAGC AGGAGGAGGA GCGCTGCGTC TGGCGCCCAA CGAACCCGTA TCGACCCGCG 26880 

AGCTTAGAAA CAGGATTTTT CCCACTCTGT ATGCTATATT TCAACAGAGC AGGGGCCAAG 26940 

AACAAGAGCT GAAAATAAAA AACAGGTCTC TGCGATCCCT CACCCGCAGC TGCCTGTATC 27000 

ACAAAAGCGA AGATCAGCTT CGGCGCACGC TGGAAGACGC GGAGGCTCTC TTCAGTAAAT 27060 

ACTGCGCGCT GACTCTTAAG GACTAGTTTC GCGCCCTTTC TCAAATTTAA GCGCGAAAAC 27120 

TACGTCATCT CCAGCGGCCA CACCCGGCGC CAGCACCTGT CGTCAGCGCC ATTATGAGCA 27180 

AGGAAATTCC CACGCCCTAC ATGTGGAGTT ACCAGCCACA AATGGGACTT GCGGCTGGAG 27240 

CTGCCCAAGA CTACTCAACC CGAATAAACT ACATGAGCGC GGGACCCCAC ATGATATCCC 27300

GGGTCAACGG AATCCGCGCC CACCGAAACC GAATTCTCTT GGAACAGGCG GCTATTACCA 27360 

CCACACCTCG TAATAACCTT AATCCCCGTA GTTGGCCCGC TGCCCTGGTG TACCAGGAAA 27420 

GTCCCGCTCC CACCACTGTG GTACTTCCCA GAGACGCCCA GGCCGAAGTT CAGATGACTA 27480 

ACTCAGGGGC GCAGCTTGCG GGCGGCTTTC GTCACAGGGT GCGGTCGCCC GGGCAGGGTA 27540 

TAACTCACCT GACAATCAGA GGGCGAGGTA TTCAGCTCAA CGACGAGTCG GTGAGCTCCT 27600 

CGCTTGGTCT CCGTCCGGAC GGGACATTTC AGATCGGCGG CGCCGGCCGT CCTTCATTCA 27660 

CGCCTCGTCA GGCAATCCTA ACTCTGCAGA CCTCGTCCTC TGAGCCGCGC TCTGGAGGCA 27720

TTGGAACTCT GCAATTTATT GAGGAGTTTG TGCCATCGGT CTACTTTAAC CCCTTCTCGG 27780

GACCTCCC3G CCACTATCCG GATCAATTTA TTCCTAACTT TGACGCGGTA AAGGACTCGG 27840

CGGACGGCTA CGACTGAATG TTAAGTGGAG AGGCAGAGCA ACTGCGCCTG AAACACCTGG 27900 

TCCACTGTCG CCGCCACAAG TGCTTTGCCC GCGACTCCGG TGAGTTTTGC TACTTTGAAT 27960

TGCCCGAGGA TCATATCGAG GGCCCGGCGC ACGGCGTCCG GCTTACCGCC CAGGGAGAGC 28020 

TTGCCCGTAG CCTGATTCGG GAGTTTACCC AGCGCCCCCT GCTAGTTGAG CGGGACAGGG 28080 

GACCCTGTGT TCTCACTGTG ATTTGCAACT GTCCTAACCT TGGATTACAT CAAGATCTTT 28140

GTTGCCATCT CTGTGCTGAG TATAATAAAT ACAGAAATTA AAATATACTG GGGCTCCTAT 28200 

CGCCATCCTG TAAACGCCAC CGTCTTCACC CGGCCAAGCA AACCAAGGCG AACCTTACCT 28260

GGTACTTTTA ACATCTCTCC CTCTGTGATT TACAACAGTT TCAACCCAGA CGGAGTGAGT 28320

CTACGAGAGA ACCTCTCCGA GCTCAGCTAC TCCATCAGAA AAAACACCAC CCTCCTTACC 28380

TGCCGGGAAC GTACGAGTGC GTCACCGGCC GCTGCACCAC ACCTACCGCC TGACCGTAAA 28440

CCAGACTTTT TCCGGACAGA CCTCAATAAC TCTGTTTACC AGAACAGGAG GTGAGCTTAG 28500

AAAACCCTTA GGGTATTAGG CCAAAGGCGC AGCTACTGTG GGGTTTATGA ACAATTCAAG 28560
CAACTCTACG GGCTATTCTA ATTCAGGTTT CTCTAGAATC GGGGTTGGGG TTATTCTCTG 28620
TCTTGTGATT CTCTTTATTC TTATACTAAC GCTTCTCTGC CTAAGGCTCG CCGCCTGCTG 28680
TGTGCACATT TGCATTTATT GTCAGCTTTT TAAACGCTGG GGTCGCCACC CAAGATGATT 28740

AGGTACATAA TCCTAGGTTT ACTCACCCTT GCGTCAGCCC ACGGTACCAC CCAAAAGGTG 28800
GATTTTAAGG AGCCAGCCTG TAATGTTACA TTCGCAGCTG AAGCTAATGA GTGCACCACT 28860 
CTTATAAAAT GCACCACAGA ACATGAAAAG CTGCTTATTC GCCACAAAAA CAAAATTGGC 28920 
AAGTATGCTG TTTATGCTAT TTGGCAGCCA GGTGACACTA CAGAGTATAA TGTTACAGTT 28980
TTCCAGGGTA AAAGTCATAA AACTTTTATG TATACTTTTC CATTTTATGA AATGTGCGAC 29040
ATTAC*- .-v * -o T ACATGAGCAA ACAGTATAAG TTGTGGCCCC CACAAAATTG TGTGGAAAAC 29100 
ACTGGC.-.CTT TCTGCTGCAC TGCTATGCTA ATTACAGTGC TCGCTTTGGT CTGTACCCTA 29160
CTCTATATTA AATACAAAAG CAGACGCAGC TTTATTGAGG AAAAGAAAAT GCCTTAATTT 29220
ACTAAGTTAC AAAGCTAATG TCACCACTAA CTGCTTTACT CGCTGCTTGC AAAACAAATT 29280
CAAAAAGTTA GCATTATAAT TAGAATAGGA TTTAAACCCC CCGGTCATTT CCTGCTCAAT 29340
ACCATTCCCC TGAACAATTG ACTCTATGTG GGATATGCTC CAGCGCTACA ACCTTGAAGT 29400
CAGGCTTCCT GGATGTCAGC ATCTGACTTT GGCCAGCACC TGTCCCGCGG ATTTGTTCCA 29460 
GTCCAACTAC AGCGACCCAC CCTAACAGAG ATGACCAACA CAACCAACGC GGCCGCCGCT 29520
ACCGGACTTA CATCTACCAC AAATACACCC CAAGTTTCTG CCTTTGTCAA TAACTGGGAT 29580 
AACTTGGGCA TGTGGTGGTT CTCCATAGCG CTTATGTTTG TATGCCTTAT TATTATGTGG 29640
CTCATCTGCT GCCTAAAGCG CAAACGCGCC CGACCACCCA TCTATAGTCC CATCATTGTG 29700
CTACACCCAA ACAATGATGG AATCCATAGA TTGGACGGAC TGAAACACAT GTTCTTTTCT 29760 
CTTACAGTAT GATTAAATGA GACATGATTC CTCGAGTTTT TATATTACTG ACCCTTGTTG 29820
CGCTTTTTTG TGCGTGCTCC ACATTGGCTG CGGTTTCTCA CATCGAAGTA GACTGCATTC 29880







CAGCCTTCAC AGTCTATTTG CTTTACGGAT TTGTCACCCT CACGCTCATC TGCAGCCTCA 29940
TCACTGTGGT CATCGCCTTT ATCCAGTGCA TTGACTGGGT CTGTGTGCGC TTTGCATATC 30000
TCAGACACCA TCCCCAGTAC AGGGACAGGA CTATAGCTGA GCTTCTTAGA ATTCTTTAAT 30060
TATGAAATTT ACTGTGACTT TTCTGCTGAT TATTTGCACC CTATCTGCGT TTTGTTCCCC 30120
GACCTCCAAG CCTCAAAGAC ATATATCATG CAGATTCACT CGTATATGGA ATATTCCAAG 30180
TTGCTACAAT GAAAAAAGCG ATCTTTCCGA AGCCTGGTTA TATGCAATCA TCTCTGTTAT 30240
GGTGTTCTGC AGTACCATCT TAGCCCTAGC TATATATCCC TACCTTGACA TTGGCTGGAA 30300
ACGAATAGAT GCCATGAACC ACCCAACTTT CCCCGCGCCC GCTATGCTTC CACTGCAACA 30360
AGTTGTTGCC GGCGGCTTTG TCCCAGCCAA TCAGCCTCGC CCCACTTCTC CCACCCCCAC 30420
TGAAATCAGC TACTTTAATC TAACAGGAGG AGATGACTGA CACCCTAGAT CTAGAAATGG 30480
ACGGAATTAT TACAGAGCAG CGCCTGCTAG AAAGACGCAG GGCAGCGGCC GAGCAACAGC 30540
GCATGAATCA AGAGCTCCAA GACATGGTTA ACTTGCACCA GTGCAAAAGG GGTATCTTTT 30600
GTCTGGTAAA GCAGGCCAAA GTCACCTACG ACAGTAATAC CACCGGACAC CGCCTTAGCT 30660
ACAAGTTGCC AACCAAGCGT CAGAAATTGG TGGTCATGGT GGGAGAAAAG CCCATTACCA 30720
TAACTCAGCA CTCGGTAGAA ACCGAAGGCT GCATTCACTC ACCTTGTCAA GGACCTGAGG 30780
ATCTCTGCAC CCTTATTAAG ACCCTGTGCG GTCTCAAAGA TCTTATTCCC TTTAACTAAT 30840
AAAAAAAAAT AATAAAGCAT CACTTACTTA AAATCAGTTA GCAAATTTCT GTCCAGTTTA 30900
TTCAGCAGCA CCTCCTTGCC CTCCTCCCAG CTCTGGTATT GCAGCTTCCT CCTGGCTGCA 30960
AACTTTCTCC ACAATCTAAA TGGAATGTCA GTTTCCTCCT GTTCCTGTCC ATCCGCACCC 31020
ACTATCTTCA TGTTGTTGCA GATGAAGCGC GCAAGACCGT CTGAAGATAC CTTCAACCCC 31080
GTGTATCCAT ATGACACGGA AACCGGTCCT CCAACTGTGC CTTTTCTTAC TCCTCCCTTT 31140
GTATCCCCCA ATGGGTTTCA AGAGAGTCCC CCTGGGGTAC TCTCTTTGCG CCTATCCGAA 31200
CCTCTAGTTA CCTCCAATGG CATGCTTGCG CTCAAAATGG GCAACGGCCT CTCTCTGGAC 31260
GAGGCCGGCA ACCTTACCTC CCAAAATGTA ACCACTGTGA GCCCACCTCT CAAAAAAACC 31320
AAGTCAAACA TAAACCTGGA AATATCTGCA CCCCTCACAG TTACCTCAGA AGCCCTAACT 31380
GTGGCTGCCG CCGCACCTCT AATGGTCGCG GGCAACACAC TCACCATGCA ATCACAGGCC 31440
CCGCTAACCG TGCACGACTC CAAACTTAGC ATTGCCACCC AAGGACCCCT CACAGTGTCA 31500
GAAGGAAAGC TAGCCCTGCA AACATCAGGC CCCCTCACCA CCACCGATAG CAGTACCCTT 31560
ACTATCACTG CCTCACCCCC TCTAACTACT GCCACTGGTA GCTTGGGCAT TGACTTGAAA 31620
GAGCCCATTT ATACACAAAA TGGAAAACTA GGACTAAAGT ACGGGGCTCC TTTGCATGTA 31680
ACAGACGACC TAAACACTTT GACCGTAGCA ACTGGTCCAG GTGTGACTAT TAATAATACT 31740
TCCTTGCAAA CTAAAGTTAC TGGAGCCTTG GGTTTTGATT CACAAGGCAA TATGCAACTT 31800
AATGTAGCAG GAGGACTAAG GATTGATTCT CAAAACAGAC GCCTTATACT TGATGTTAGT 31860
TATCCGTTTG ATGCTCAAAA CCAACTAAAT CTAAGACTAG GACAGGGCCC TCTTTTTATA 31920
AACTCAGCCC ACAACTTGGA TATTAACTAC AACAAAGGCC TTTACTTGTT TACAGCTTCA 31980
AACAATTCCA AAAAGCTTGA GGTTAACCTA AGCACTGCCA AGGGGTTGAT GTTTGACGCT 32040

ACAGCCATAG CCATTAATGC AGGAGATGGG CTTGAATTTG GTTCACCTAA TGCACCAAAC 32100

ACAAATCCCC TCAAAACAAA AATTGGCCAT GGCCTAGAAT TTGATTCAAA CAAGGCTATG 32160 

GTTCCTAAAC TAGGAACTGG CCTTAGTTTT GACAGCACAG GTGCCATTAC AGTAGGAAAC 32220 
AAAAAi AATG ATAAGCTAAC TTTGTGGACC ACACCAGCTC CATCTCCTAA CTGTAGACTA 32280 
AATGCAGAGA AAGATGCTAA ACTCACTTTG GTCTTAACAA AATGTGGCAG TCAAATACTT 32340 
GCTACAGTTT CAGTTTTGGC TGTTAAAGGC AGTTTGGCTC CAATATCTGG AACAGTTCAA 32400 
AGTGCTCATC TTATTATAAG ATTTGACGAA AATGGAGTGC TACTAAACAA TTCCTTCCTG 32460 
GACCCAGAAT ATTGGAACTT TAGAAATGGA GATCTTACTG AAGGCACAGC CTATACAAAC 32520
GCTGTTGGAT TTATGCCTAA CCTATCAGCT TATCCAAAAT CTCACGGTAA AACTGCCAAA 32580
AGTAACATTG TCAGTCAAGT TTACTTAAAC GGAGACAAAA CTAAACCTGT AACACTAACC 32640
ATTACACTAA ACGGTACACA GGAAACAGGA GACACAACTC CAAGTGCATA CTCTATGTCA 32700 
TTTTCATGGG ACTGGTCTGG CCACAACTAC ATTAATGAAA TATTTGCCAC ATCCTCTTAC 32760 
ACTTTTTCAT ACATTGCCCA AGAATAAAGA ATCGTTTGTG TTATGTTTCA ACGTGTTTAT 32820 
TTTTCAATTG CAGAAAATTT CAAGTCATTT TTCATTCAGT AGTATAGCCC CACCACCACA 32880 
TAGCTTATAC AGATCACCGT ACCTTAATCA AACTCACAGA ACCCTAGTAT TCAACCTGCC 32940 
ACCTCCCTCC CAACACACAG AGTACACAGT CCTTTCTCCC CGGCTGGCCT TAAAAAGCAT 33000
CATATCATGG GTAACAGACA TATTCTTAGG TGTTATATTC CACACGGTTT CCTGTCGAGC 33060 
CAAACGCTCA TCAGTGATAT TAATAAACTC CCCGGGCAGC TCACTTAAGT TCATGTCGCT 33120 
GTCCAGCTGC TGAGCCACAG GCTGCTGTCC AACTTGCGGT TGCTTAACGG GCGGCGAAGG 33180 
AGAAGTCCAC GCCTACATGG GGGTAGAGTC ATAATCGTGC ATCAGGATAG GGCGGTGGTG 33240 
CTGCAGCAGC GCGCGAATAA ACTGCTGCCG CCGCCGCTCC GTCCTGCAGG AATACAACAT 33300 
GGCAGTGGTC TCCTCAGCGA TGATTCGCAC CGCCCGCAGC ATAAGGCGCC TTGTCCTCCG 33360
GGCACAGCAG CGCACCCTGA TCTCACTTAA ATCAGCACAG TAACTGCAGC ACAGCACCAC 33420
AATATTGTTC AAAATCCCAC AGTGCAAGGC GCTGTATCCA AAGCTCATGG CGGGGACCAC 33480 
AGAACCCACG TGGCCATCAT ACCACAAGCG CAGGTAGATT AAGTGGCGAC CCCTCATAAA 33540 
CACGCTGGAC ATAAACATTA CCTCTTTTGG CATGTTGTAA TTCACCACCT CCCGGTACCA 33600 
TATAAACCTC TGATTAAACA TGGCGCCATC CACCACCATC CTAAACCAGC TGGCCAAAAC 33660
CTGCCCGCCG GCTATACACT GCAGGGAACC GGGACTGGAA CAATGACAGT GGAGAGCCCA 33720 
GGACTCGTAA CCATGGATCA TCATGCTCGT CATGATATCA ATGTTGGCAC AACACAGGCA 33780 
CACGTGCATA CACTTCCTCA GGATTACAAG CTCCTCCCGC GTTAGAACCA TATCCCAGGG 33840 
AACAACCCAT TCCTGAATCA GCGTAAATCC CACACTGCAG GGAAGACCTC GCACGTAACT 33900
CACGTTGTGC ATTGTCAAAG TGTTACATTC GGGCAGCAGC GGATGATCCT CCAGTATGGT 33960 
AGCGCGGGTT TCTGTCTCAA AAGGAGGTAG ACGATCCCTA CTGTACGGAG TGCGCCGAGA 34020
CAACCGAGAT CGTGTTGGTC GTAGTGTCAT GCCAAATGGA ACGCCGGACG TAGTCATATT 34080
TCCTGAAGCA AAACCAGGTG CGGGCGTGAC AAACAGATCT GCGTCTCCGG TCTCGCCGCT 34140

TAGATCGCTC TGTGTAGTAG TTGTAGTATA TCCACTCTCT CAAAGCATCC AGGCGCCCCC 34200 

TGGCTTCGGG TTCTATGTAA ACTCCTTCAT GCGCCGCTGC CCTGATAACA TCCACCACCG 34260 

CAGAATAAGC CACACCCAGC CAACCTACAC ATTCGTTCTG CGAGTCACAC ACGGGAGGAG 34320 

CGGGAAGAGC TGGAAGAACC ATGTTTTTTT TTTTATTCCA AAAGATTATC CAAAACCTCA 34380 

AAATGAAGAT CTATTAAGTG AACGCGCTCC CCTCCGGTGG CGTGGTCAAA CTCTACAGCC 34440 

AAAGAACAGA TAATGGCATT TGTAAGATGT TGCACAATGG CTTCCAAAAG GCAAACGGCC 34500

CTCACGTCCA AGTGGACGTA AAGGCTAAAC CCTTCAGGGT GAATCTCCTC TATAAACATT 34560

CCAGCACCTT CAACCATGCC CAAATAATTC TCATCTCGCC ACCTTCTCAA TATATCTCTA 34620 

AGCAAATCCC GAATATTAAG TCCGGCCATT GTAAAAATCT GCTCCAGAGC GCCCTCCACC 34680 

TTCAGCCTCA AGCAGCGAAT CATGATTGCA AAAATTCAGG TTCCTCACAG ACCTGTATAA 34740

GATTCAAAAG CGGAACATTA ACAAAAATAC CGCGATCCCG TAGGTCCCTT CGCAGGGCCA 34800 

GCTGAACATA ATCGTGCAGG TCTGCACGGA CCAGCGCGGC CACTTCCCCG CCAGGAACCT 34860 

TGACAAAAGA ACCCACACTG ATTATGACAC GCATACTCGG AGCTATGCTA ACCAGCGTAG 34920 

CCCCGATGTA AGCTTTGTTG CATGGGCGGC GATATAAAAT GCAAGGTGCT GCTCAAAAAA 34980 

TCAGGCAAAG CCTCGCGCAA AAAAGAAAGC ACATCGTAGT CATGCTCATG CAGATAAAGG 35040 

CAGGTAAGCT CCGGAACCAC CACAGAAAAA GACACCATTT TTCTCTCAAA CATGTCTGCG 35100 

GGTTTCTGCA TAAACACAAA ATAAAATAAC AAAAAAACAT TTAAACATTA GAAGCCTGTC 35160

TTACAACAGG AAAAACAACC CTTATAAGCA TAAGACGGAC TACGGCCATG CCGGCGTGAC 35220 

CGTAAAAAAA CTGGTCACCG TGATTAAAAA GCACCACCGA CAGCTCCTCG GTCATGTCCG 35280 

GAGTCATAAT GTAAGACTCG GTAAACACAT CAGGTTGATT CATCGGTCAG TGCTAAAAAG 35340 

CGACCGAAAT AGCCCGGGGG AATACATACC CGCAGGCGTA GAGACAACAT TACAGCCCCC 35400

ATAGGAGGTA TAACAAAATT AATAGGAGAG AAAAACACAT AAACACCTGA AAAACCCTCC 35460 

TGCCTAGGCA AAATAGCACC CTCCCGCTCC AGAACAACAT ACAGCGCTTC ACAGCGGCAG 35520 

CCTAACAGTC AGCCTTACCA GTAAAAAAGA AAACCTATTA AAAAAACACC ACTCGACACG 35580

GCACCAGCTC AATCAGTCAC AGTGTAAAAA AGGGCCAAGT GCAGAGCGAG TATATATAGG 35640

ACTAAAAAAT GACGTAACGG TTAAAGTCCA CAAAAAACAC CCAGAAAACC GCACGCGAAC 35700 

CTACGCCCAG AAACGAAAGC CAAAAAACCC ACAACTTCCT CAAATCGTCA CTTCCGTTTT 35760 

CCCACGTTAC GTAACTTCCC ATTTTAAGAA AACTACAATT CCCAACACAT ACAAGTTACT 35820 

CCGCCCTAAA ACCTACGTCA CCCGCCCCGT TCCCACGCCC CGCGCCACGT CACAAACTCC 35880 

ACCCCCTCAT TATCATATTG GCTTCAATCC AAAATAAGGT ATATTATTGA TGATG 35935 

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

TTCATTTTAT GTTTCAGGTT CAGGG 25

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

TTACCGCCAC ACTCGCAGGG 20

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34303 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

TTATTTTGGA TTGAAGCCAA TATGATAATG AGGGGGTGGA GTTTGTGACG TGGCGCGGGG 60
CGTGGGAACG GGGCGGGTGA CGTAGTAGTG TGGCGGAAGT GTGATGTTGC AAGTGTGGCG 120
GAACACATGT AAGCGACGGA TGTGGCAAAA GTGACGTTTT TGGTGTGCGC CGGATCCACA 180
GGACGGGTGT GGTCGCCATG ATCGCGTAGT CGATAGTGGC TCCAAGTAGC GAAGCGAGCA 240
GGACTGGGCG GCGGCCAAAG CGGTCGGACA GTGCTCCGAG AACGGGTGCG CATAGAAATT 300
GCATCAACGC ATATAGCGCT AGCAGCACGC CATAGTGACT GGCGATGCTG TCGGAATGGA 360
CGATATCCCG CAAGAGGCCC GGCAGTACCG GCATAACCAA GCCTATGCCT ACAGCATCCA 420
GGGTGACGGT GCCGAGGATG ACGATGAGCG CATTGTTAGA TTTCATACAC GGTGCCTGAC 480
TGCGTTAGCA ATTTAACTGT GATAAACTAC CGCATTAAAG CTTATCGATG ATAAGCTGTC 540
AAACATGAGA ATTCTTGAAG ACGAAAGGGC CTCGTGATAC GCCTATTTTT ATAGGTTAAT 600
GTCATGATAA TAATGGTTTC TTAGACGTCA GGTGGCACTT TTCGGGGAAA TGTGCGCGGA 660
ACCCCTATTT GTTTATTTTT CTAAATACAT TCAAATATGT ATCCGCTCAT GAGACAATAA 720
CCCTGATAAA TGCTTCAATA ATATTGAAAA AGGAAGAGTA TGAGTATTCA ACATTTCCGT 780
GTCGCCCTTA TTCCCTTTTT TGCGGCATTT TGCCTTCCTG TTTTTGCTCA CCCAGAAACG 840
CTGGTGAAAG TAAAAGATGC TGAAGATCAG TTGGGTGCAC GAGTGGGTTA CATCGAACTG 900
GATCTCAACA GCGGTAAGAT CCTTGAGAGT TTTCGCCCCG AAGAACGTTT TCCAATGATG 960
AGCACTTTTA AAGTTCTGCT ATGTGGCGCG GTATTATCCC GTGTTGACGC CGGGCAAGAG 1020
CAACTCGGTC GCCGCATACA CTATTCTCAG AATGACTTGG TTGAGTACTC ACCAGTCACA 1080
CATAACCATG 1140


AGTGATAACA CTGCGGCCAA CTTACTTCTG ACAACGATCG GAGGACCGAA GGAGCTAACC 1200
GCTTTTTTGC ACAACATGGG GGATCATGTA ACTCGCCTTG ATCGTTGGGA ACCGGAGCTG 1260
AATGAAGCCA TACCAAACGA CGAGCGTGAC ACCACGATGC CTGCAGCAAT GGCAACAACG 1320
TTGCGCAAAC TATTAACTGG CGAACTACTT ACTCTAGCTT CCCGGCAACA ATTAATAGAC 1380
TGGATGGAGG CGGATAAAGT TGCAGGACCA CTTCTGCGCT CGGCCCTTCC GGCTGGCTGG 1440
TTTATTGCTG ATAAATCTGG AGCCGGTGAG CGTGGGTCTC GCGGTATCAT TGCAGCACTG 1500
GGGCCAGATG GTAAGCCCTC CCGTATCGTA GTTATCTACA CGACGGGGAG TCAGGCAACT 1560
ATGGATGAAC GAAATAGACA GATCGCTGAG ATAGGTGCCT CACTGATTAA GCATTGGTAA 1620
CTGTCAGACC AAGTTTACTC ATATATACTT TAGATTGATT TAAAACTTCA TTTTTAATTT 1680
AAAAGGATCT AGGTGAAGAT CCTTTTTGAT AATCTCATGA CCAAAATCCC TTAACGTGAG 1740
TTTTCGTTCC ACTGAGCGTC AGACCCCGTA GAAAAGATCA AAGGATCTTC TTGAGATCCT 1800
TTTTTTCTGC GCGTAATCTG CTGCTTGCAA ACAAAAAAAC CACCGCTACC AGCGGTGGTT 1860
TGTTTGCCGG ATCAAGAGCT ACCAACTCTT TTTCCGAAGG TAACTGGCTT CAGCAGAGCG 1920
CAGATACCAA ATACTGTCCT TCTAGTGTAG CCGTAGTTAG GCCACCACTT CAAGAACTCT 1980
GTAGCACCGC CTACATACCT CGCTCTGCTA ATCCTGTTAC CAGTGGCTGC TGCCAGTGGC 2040
GATAAGTCGT GTCTTACCGG GTTGGACTCA AGACGATAGT TACCGGATAA GGCGCAGCGG 2100
TCGGGCTGAA CGGGGGGTTC GTGCACACAG CCCAGCTTGG AGCGAACGAC CTACACCGAA 2160
CTGAGATACC TACAGCGTGA GCATTGAGAA AGCGCCACGC TTCCCGAAGG GAGAAAGGCG 2220
GACAGGTATC CGGTAAGCGG CAGGGTCGGA ACAGGAGAGC GCACGAGGGA GCTTCCAGGG 2280
GGAAACGCCT GGTATCTTTA TAGTCCTGTC GGGTTTCGCC ACCTCTGACT TGAGCGTCGA 2340
TTTTTGTGAT GCTCGTCAGG GGGGCGGAGC CTATGGAAAA ACGCCAGCAA CGCGGCCTTT 2400
TTACGGTTCC TGGCCTTTTG CTGGCCTTTT GCTCACATGT TCTTTCCTGC GTTATCCCCT 2460
GATTCTGTGG ATAACCGTAT TACCGCCTTT GAGTGAGCTG ATACCGCTCG CCGCAGCCGA 2520
ACGACCGAGC GCAGCGAGTC AGTGAGCGAG GAAGCGGAAG AGCGCCTGAT GCGGTATTTT 2580
CTCCTTACGC ATCTGTGCGG TATTTCACAC CGCATATGGT GCACTCTCAG TACAATCTGC 2640
TCTGATGCCG CATAGTTAAG CCAGTATACA CTCCGCTATC GCTACGTGAC TGGGTCATGG 2700
CTGCGCCCCG ACACCCGCCA ACACCCGCTG ACGCGCCCTG ACGGGCTTGT CTGCTCCCGG 2760
CATCCGCTTA CAGACAAGCT GTGACCGTCT CCGGGAGCTG CATGTGTCAG AGGTTTTCAC 2820
CGTCATCACC GAAACGCGCG AGGCAGTCTA GACAATAGTA GTACGGATAG CTGTGACTCC 2880
GGTCCTTCTA ACACACCTCC TGAGATACAC CCGGTGGTCC CGCTGTGCCC CATTAAACCA 2940
GTTGCCGTGA GAGTTGGTGG GCGTCGCCAG GCTGTGGAAT GTATCGAGGA CTTGCTTAAC 3000
GAGCCTGGGC AACCTTTGGA CTTGAGCTGT AAACGCCCCA GGCCATAAGG TGTAAACCTG 3060
TGATTGCGTG TGTGGTTAAC GCCTTTGTTT GCTGAATGAG TTGATGTAAG TTTAATAAAG 3120
GGTGAGATAA TGTTTAACTT GCATGGCGTG TTAAATGGGG CGGGGCTTAA AGGGTATATA 3180
ATGCGCCGTG GGCTAATCTT GGTTACATCT GACCTCATGG AGGCTTGGGA GTGTTTGGAA 3240

GATTTTTCTG CTGTGCGTAA CTTGCTGGAA CAGAGCTCTA ACAGTACCTC TTGGTTTTGG 3300 

AGGTTTCTGT GGGGCTCATC CCAGGCAAAG TTAGTCTGCA GAATTAAGGA GGATTAGAAG 3360 

TGGGAATTTG AAGAGCTTTT GAAATCCTGT GGTGAGCTGT TTGATTCTTT GAATCTGGGT 3420 

CACCAGGCGC TTTTCCAAGA GAAGGTCATC AAGACTTTGG ATTTTTCCAC ACCGGGGCGC 3480 

GCTGCGGCTG CTGTTGCTTT TTTGAGTTTT ATAAAGGATA AATGGAGCGA AGAAACCCAT 3540

CTGAGCGGGG GGTACCTGCT GGATTTTCTG GCCATGCATC TGTGGAGAGC GGTTGTGAGA 3600 

CACAAGAATC GCCTGCTACT GTTGTCTTCC GTCCGCCCGG CGATAATACC GACGGAGGAG 3660

CAGCAGCAGC AGCAGGAGGA AGCCAGGCGG CGGCGGCAGG AGCAGAGCCC ATGGAACCCG 3720 

AGAGCCGGCC TGGACCCTCG GGAATGAATG TTGTACAGGT GGCTGAACTG TATCCAGAAC 3780

TGAGACGCAT TTTGACAATT ACAGAGGATG GGCAGGGGCT AAAGGGGGTA AAGAGGGAGC 3840 

GGGGGGCTTG TGAGGCTACA GAGGAGGCTA GGAATCTAGC TTTTAGCTTA ATGACCAGAC 3900 

ACCGTCCTGA GTGTATTACT TTTCAACAGA TCAAGGATAA TTGCGCTAAT GAGCTTGATC 3960

TGCTGGCGCA GAAGTATTCC ATAGAGCAGC TGACCACTTA CTGGCTGCAG CCAGGGGATG 4020 

ATTTTGAGGA GGCTATTAGG GTATATGCAA AGGTGGCACT TAGGCCAGAT TGCAAGTACA 4080 

AGATCAGCAA ACTTGTAAAT ATCAGGAATT GTTGCTACAT TTCTGGGAAC GGGGCCGAGG 4140 

TGGAGATAGA TACGGAGGAT AGGGTGGCCT TTAGATGTAG CATGATAAAT ATGTGGCCGG 4200 

GGGTGCTTGG CATGGACGGG GTGGTTATTA TGAATGTAAG GTTTACTGGC CCCAATTTTA 4260 

GCGGTACGGT TTTCCTGGCC AATACCAACC TTATCCTACA CGGTGTAAGC TTCTATGGGT 4320

TTAACAATAC CTGTGTGGAA GCCTGGACCG ATGTAAGGGT TCGGGGCTGT GCCTTTTACT 4380

GCTGCTGGAA GGGGGTGGTG TGTCGCCCCA AAAGCAGGGC TTCAATTAAG AAATGCCTCT 4440 

TTGAAAGGTG TACCTTGGGT ATCCTGTCTG AGGGTAACTC CAGGGTGCGC CACAATGTGG 4500

CCTCCGACTG TGGTTGCTTC ATGCTAGTGA AAAGCGTGGC TGTGATTAAG CATAACATGG 4560 

TATGTGGCAA CTGCGAGGAC AGGGCCTCTC AGATGCTGAC CTGCTCGGAC GGCAACTGTC 4620 

ACCTGCTGAA GACCATTCAC GTAGCCAGCC ACTCTCGCAA GGCCTGGCCA GTGTTTGAGC 4680 

ATAACATACT GACCCGCTGT TCCTTGCATT TGGGTAACAG GAGGGGGGTG TTCCTACCTT 4740 

ACCAATGCAA TTTGAGTCAC ACTAAGATAT TGCTTGAGCC CGAGAGCATG TCCAAGGTGA 4800 

ACCTGAACGG GGTGTTTGAC ATGACCATGA AGATCTGGAA GGTGCTGAGG TACGATGAGA 4860 

CCCGCACCAG GTGCAGACCC TGCGAGTGTG GCGGTAAACA TATTAGGAAC CAGCCTGTGA 4920 

TGCTGGATGT GACCGAGGAG CTGAGGCCCG ATCACTTGGT GCTGGCCTGC ACCCGCGCTG 4980

AGTTTGGCTC TAGCGATGAA GATACAGATT GAGGTACTGA AATGTGTGGG CGTGGCTTAA 5040

GGGTGGGAAA GAATATATAA GGTGGGGGTC TTATGTAGTT TTGTATCTGT TTTGCAGCAG 5100 

CCGCCGCCGC CATGAGCACC AACTCGTTTG ATGGAAGCAT TGTGAGCTCA TATTTGACAA 5160 

CGCGCATGCC CCCATGGGCC GGGGTGCGTC AGAATGTGAT GGGCTCCAGC ATTGATGGTC 5220 

GCCCCGTCCT GCCCGCAAAC TCTACTACCT TGACCTACGA GACCGTGTCT GGAACGCCGT 5280 







WO 98/17783 



PCT/US97/19541



TGGAGACTGC AGCCTCCGCC GCCGCTTCAG CCGCTGCAGC CACCGCCCGC GGGATTGTGA 5340
CTGACTTTGC TTTCCTGAGC CCGCTTGCAA GCAGTGCAGC TTCCCGTTCA TCCGCCCGCG 5400
ATGACAAGTT GACGGCTCTT TTGGCACAAT TGGATTCTTT GACCCGGGAA CTTAATGTCG 5460
TTTCTCAGCA GCTGTTGGAT CTGCGCCAGC AGGTTTCTGC CCTGAAGGCT TCCTCCCCTC 5520
CCAATGCGGT TTAAAACATA AATAAAAAAC CAGACTCTGT TTGGATTTGG ATCAAGCAAG 5580
TGTCTTGGTG TCTTTATTTA GGGGTTTTGC GCGCGCGGTA GGCCCGGGAC CAGCGGTCTC 5640
GGTCGTTGAG GGTCCTGTGT ATTTTTTCCA GGACGTGGTA AAGGTGACTC TGGATGTTCA 5700
GATACATGGG CATAAGCCCG TCTCTGGGGT GGAGGTAGCA CCACTGCAGA GCTTCATGCT 5760
GCGGGGTGGT GTTGTAGATG ATCCAGTCGT AGCAGGAGCG CTGGGCGTGG TGCCTAAAAA 5820
TGTCTTTCAG TAGCAAGCTG ATTGCCAGGG GCAGGCCCTT GGTGTAAGTG TTTACAAAGC 5880
GGTTAAGCTG GGATGGGTGC ATACGTGGGG ATATGAGATG CATCTTGGAC TGTATTTTTA 5940
GGTTGGCTAT GTTCCCAGCC ATATCCCTCC GGGGATTCAT GTTGTGCAGA ACCACCAGCA 6000
CAGTGTATCC GGTGCACTTG GGAAATTTGT CATGTAGCTT AGAAGGAAAT GCGTGGAAGA 6060
ACTTGGAGAC GCCCTTGTGA CCTCCAAGAT TTTCCATGCA TTCGTCCATA ATGATGGCAA 6120
TGGGCCCACG GGCGGCGGCC TGGGCGAAGA TATTTCTGGG ATCACTAACG TCATAGTTGT 6180
GTTCCAGGAT GAGATCGTCA TAGGCCATTT TTACAAAGCG CGGGCGGAGG GTGCCAGACT 6240
GCGGTATAAT GGTTCCATCC GGCCCAGGGG CGTAGTTACC CTCACAGATT TGCATTTCCC 6300
ACGCTTTGAG TTCAGATGGG GGGATCATGT CTACCTGCGG GGCGATGAAG AAAACGGTTT 6360
CCGGGGTAGG GGAGATCAGC TGGGAAGAAA GCAGGTTCCT GAGCAGCTGC GACTTACCGC 6420
AGCCGGTGGG CCCGTAAATC ACACCTATTA CCGGGTGCAA CTGGTAGTTA AGAGAGCTGC 6480
AGCTGCCGTC ATCCCTGAGC AGGGGGGCCA CTTCGTTAAG CATGTCCCTG ACTCGCATGT 6540
TTTCCCTGAC CAAATCCGCC AGAAGGCGCT CGCCGCCCAG CGATAGCAGT TCTTGCAAGG 6600
AAGCAAAGTT TTTCAACGGT TTGAGACCGT CCGCCGTAGG CATGCTTTTG AGCGTTTGAC 6660
CAAGCAGTTC CAGGCGGTCC CACAGCTCGG TCACCTGCTC TACGGCATCT CGATCCAGCA 6720
TATCTCCTCG TTTCGCGGGT TGGGGCGGCT TTCGCTGTAC GGCAGTAGTC GGTGCTCGTC 6780
CAGACGGGCC AGGGTCATGT CTTTCCACGG GCGCAGGGTC CTCGTCAGCG TAGTCTGGGT 6840
CACGGTGAAG GGGTGCGCTC CGGGCTGCGC GCTGGCCAGG GTGCGCTTGA GGCTGGTCCT 6900
GCTGGTGCTG AAGCGCTGCC GGTCTTCGCC CTGCGCGTCG GCCAGGTAGC ATTTGACCAT 6960
GGTGTCATAG TCCAGCCCCT CCGCGGCGTG GCCCTTGGCG CGCAGCTTGC CCTTGGAGGA 7020
GGCGCCGCAC GAGGGGCAGT GCAGACTTTT GAGGGCGTAG AGCTTGGGCG CGAGAAATAC 7080
CGATTCCGGG GAGTAGGCAT CCGCGCCGCA GGCCCCGCAG ACGGTCTCGC ATTCCACGAG 7140
CCAGGTGAGC TCTGGCCGTT CGGGGTCAAA AACCAGGTTT CCCCCATGCT TTTTGATGCG 7200
TTTCTTACCT CTGGTTTCCA TGAGCCGGTG TCCACGCTCG GTGACGAAAA GGCTGTCCGT 7260
GTCCCCGTAT ACAGACTTGA GAGGCCTGTC CTCGAGCGGT GTTCCGCGGT CCTCCTCGTA 7320
TAGAAACTCG GACCACTCTG AGACAAAGGC TCGCGTCCAG GCCAGCACGA AGGAGGCTAA 7380





GTGGGAGGGG TAGCGGTCGT TGTCCACTAG GGGGTCCACT CGCTCCAGGG TGTGAAGACA 7440

CATGTC3CCC TCTTCGGCAT CAAGGAAGGT GATTGGTTTG TAGGTGTAGG CCACGTGACC 7500 

GGGTGTTCCT GAAGGGGGGC TATAAAAGGG GGTGGGGGCG CGTTCGTCCT CACTCTCTTC 7560

CGCATCGCTG TCTGCGAGGG CCAGCTGTTG GGGTGAGTAC TCCCTCTGAA AAGCGGGCAT 7620 

GACTTCTGCG CTAAGATTGT CAGTTTCCAA AAACGAGGAG GATTTGATAT TCACCTGGCC 7680 

CGCGGTGATG CCTTTGAGGG TGGCCGCATC CATCTGGTCA GAAAAGACAA TCTTTTTGTT 7740

GTCAAGCTTG GTGGCAAACG ACCCGTAGAG GGCGTTGGAC AGCAACTTGG CGATGGAGCG 7800 

CAGGGTTTGG TTTTTGTCGC GATCGGCGCG CTCCTTGGCC GCGATGTTTA GCTGCACGTA 7860

TTCGCGCGCA ACGCACCGCC ATTCGGGAAA GACGGTGGTG CGCTCGTCGG GCACCAGGTG 7920 

CACGCGCCAA CCGCGGTTGT GCAGGGTGAC AAGGTCAACG CTGGTGGCTA CCTCTCCGCG 7980

TAGGCGCTCG TTGGTCCAGC AGAGGCGGCC GCCCTTGCGC GAGCAGAATG GCGGTAGGGG 8040

GTCTAGCTGC GTCTCGTCCG GGGGGTCTGC GTCCACGGTA AAGACCCCGG GCAGCAGGCG 8100

CGCGTCGAAG TAGTCTATCT TGCATCCTTG CAAGTCTAGC GCCTGCTGCC ATGCGCGGGC 8160

GGCAAGCGCG CGCTCGTATG GGTTGAGTGG GGGACCCCAT GGCATGGGGT GGGTGAGCGC 8220 

GGAGGCGTAC ATGCCGCAAA TGTCGTAAAC GTAGAGGGGC TCTCTGAGTA TTCCAAGATA 8280 

TGTAGGGTAG CATCTTCCAC CGCGGATGCT GGCGCGCACG TAATCGTATA GTTCGTGCGA 8340 

GGGAGCGAGG AGGTCGGGAC CGAGGTTGCT ACGGGCGGGC TGCTCTGCTC GGAAGACTAT 8400

CTGCCTGAAG ATGGCATGTG AGTTGGATGA TATGGTTGGA CGCTGGAAGA CGTTGAAGCT 8460 

GGCGTCTGTG AGACCTACCG CGTCACGCAC GAAGGAGGCG TAGGAGTCGC GCAGCTTGTT 8520 

GACCAGCTCG GCGGTGACCT GCACGTCTAG GGCGCAGTAG TCCAGGGTTT CCTTGATGAT 8580 

GTCATACTTA TCCTGTCCCT TTTTTTTCCA CAGCTCGCGG TTGAGGACAA ACTCTTCGCG 8640 

GTCTTTCCAG TACTCTTGGA TCGGAAACCC GTCGGCCTCC GAACGGTAAG AGCCTAGCAT 8700 

GTAGAACTGG TTGACGGCCT GGTAGGCGCA GCATCCCTTT TCTACGGGTA GCGCGTATGC 8760 

CTGCGCGGCC TTCCGGAGCG AGGTGTGGGT GAGCGCAAAG GTGTCCCTGA CCATGACTTT 8820 

GAGGTACTGG TATTTGAAGT CAGTGTCGTC GCATCCGCCC TGCTCCCAGA GCAAAAAGTC 8880

CGTGCGCTTT TTGGAACGCG GATTTGGCAG GGCGAAGGTG ACATCGTTGA AGAGTATCTT 8940

TCCCGCGCGA GGCATAAAGT TGCGTGTGAT GCGGAAGGGT CCCGGCACCT CGGAACGGTT 9000 

GTTAATTACC TGGGCGGCGA GCACGATCTC GTCAAAGCCG TTGATGTTGT GGCCCACAAT 9060 

GTAAAGTTCC AAGAAGCGCG GGATGCCCTT GATGGAAGGC AATTTTTTAA GTTCCTCGTA 9120

GGTGAGCTCT TCAGGGGAGC TGAGCCCGTG CTCTGAAAGG GCCCAGTCTG CAAGATGAGG 9180 

GTTGGAAGCG ACGAATGAGC TCCACAGGTC ACGGGCCATT AGCATTTGCA GGTGGTCGCG 9240

AAAGGTCCTA AACTGGCGAC CTATGGCCAT TTTTTCTGGG GTGATGCAGT AGAAGGTAAG 9300 

CGGGTCTTGT TCCCAGCGGT CCCATCCAAG GTTCGCGGCT AGGTCTCGCG CGGCAGTCAC 9360 

TAGAGGCTCA TCTCCGCCGA ACTTCATGAC CAGCATGAAG GGCACGAGCT GCTTCCCAAA 9420

GGCCCCCATC CAAGTATAGG TCTCTACATC GTAGGTGACA AAGAGACGCT CGGTGCGAGG 9480
ATGCGAGCCG ATCGGGAAGA ACTGGATCTC CCGCCACCAA TTGGAGGAGT GGCTATTGAT 9540
GTGGTGAAAG TAGAAGTCCC TGCGACGGGC CGAACACTCG TGCTGGCTTT TGTAAAAACG 9600
TGCGCAGTAC TGGCAGCGGT GCACGGGCTG TACATCCTGC ACGAGGTTGA CCTGACGACC 9660
GCGCACAAGG AAGCAGAGTG GGAATTTGAG CCCCTCGCCT GGCGGGTTTG GCTGGTGGTC 9720
TTCTACTTCG GCTGCTTGTC CTTGACCGTC TGGCTGCTCG AGGGGAGTTA CGGTGGATCG 9780
GACCACCACG CCGCGCGAGC CCAAAGTCCA GATGTCCGCG CGCGGCGGTC GGAGCTTGAT 9840
GACAACATCG CGCAGATGGG AGCTGTCCAT GGTCTGGAGC TCCCGCGGCG TCAGGTCAGG 9900
CGGGAGCTCC TGCAGGTTTA CCTCGCATAG ACGGGTCAGG GCGCGGGCTA GATCCAGGTG 9960
ATACCTAATT TCCAGGGGCT GGTTGGTGGC GGCGTCGATG GCTTGCAAGA GGCCGCATCC 10020
CCGCGGCGCG ACTACGGTAC CGCGCGGCGG GCGGTGGGCC GCGGGGGTGT CCTTGGATGA 10080
TGCATCTAAA AGCGGTGACG CGGGCGAGCC CCCGGAGGTA GGGGGGGCTC CGGACCCGCC 10140
GGGAGAGGGG GCAGGGGCAC GTCGGCGCCG CGCGCGGGCA GGAGCTGGTG CTGCGCGCGT 10200
AGGTTGCTGG CGAACGCGAC GACGCGGCGG TTGATCTCCT GAATCTGGCG CCTCTGCGTG 10260
AAGACGACGG GCCCGGTGAG CTTGAGCCTG AAAGAGAGTT CGACAGAATC AATTTCGGTG 10320
TCGTTGACGG CGGCCTGGCG CAAAATCTCC TGCACGTCTC CTGAGTTGTC TTGATAGGCG 10380
ATCTCGGCCA TGAACTGCTC GATCTCTTCC TCCTGGAGAT CTCCGCGTCC GGCTCGCTCC 10440
ACGGTGGCGG CGAGGTCGTT GGAAATGCGG GCCATGAGCT GCGAGAAGGC GTTGAGGCCT 10500
CCCTCGTTCC AGACGCGGCT GTAGACCACG CCCCCTTCGG CATCGCGGGC GCGCATGACC 10560
ACCTGCGCGA GATTGAGCTC CACGTGCCGG GCGAAGACGG CGTAGTTTCG CAGGCGCTGA 10620
AAGAGGTAGT TGAGGGTGGT GGCGGTGTGT TCTGCCACGA AGAAGTACAT AACCCAGCGT 10680
CGCAACGTGG ATTCGTTGAT ATCCCCCAAG GCCTCAAGGC GCTCCATGGC CTCGTAGAAG 10740
TCCACGGCGA AGTTGAAAAA CTGGGAGTTG CGCGCCGACA CGGTTAACTC CTCCTCCAGA 10800
AGACGGATGA GCTCGGCGAC AGTGTCGCGC ACCTCGCGCT CAAAGGCTAC AGGGGCCTCT 10860
TCTTCTTCTT CAATCTCCTC TTCCATAAGG GCCTCCCCTT CTTCTTCTTC TGGCGGCGGT 10920
GGGGGAGGGG GGACACGGCG GCGACGACGG CGCACCGGGA GGCGGTCGAC AAAGCGCTCG 10980
ATCATCTCCC CGCGGCGACG GCGCATGGTC TCGGTGACGG CGCGGCCGTT CTCGCGGGGG 11040
CGCAGTTGGA AGACGCCGCC CGTCATGTCC CGGTTATGGG TTGGCGGGGG GCTGCCATGC 11100
GGCAGGGATA CGGCGCTAAC GATGCATCTC AACAATTGTT GTGTAGGTAC TCCGCCGCCG 11160
AGGGACCTGA GCGAGTCCGC ATCGACCGGA TCGGAAAACC TCTCGAGAAA GGCGTCTAAC 11220
CAGTCACAGT CGCAAGGTAG GCTGAGCACC GTGGCGGGCG GCAGCGGGCG GCGGTCGGGG 11280
TTGTTTCTGG CGGAGGTGCT GCTGATGATG TAATTAAAGT AGGCGGTCTT GAGACGGCGG 11340
ATGGTCGACA GAAGCACCAT GTCCTTGGGT CCGGCCTGCT GAATGCGCAG GCGGTCGGCC 11400
ATGCCCCAGG CTTCGTTTTG ACATCGGCGC AGGTCTTTGT AGTAGTCTTG CATGAGCCTT 11460
TCTACCGGCA CTTCTTCTTC TCCTTCCTCT TGTCCTGCAT CTCTTGCATC TATCGCTGCG 11520
GCGGCGGCGG AGTTTGGCCG TAGGTGGCGC CCTCTTCCTC CCATGCGTGT GACCCCGAAG 11580
CCCCTCA. i >_G GCTGAAGCAG GGCTAGGTCG GCGACAACGC GCTCGGCTAA TATGGCCTGC 11640

TGCACCTGCG TGAGGGTAGA CTGGAAGTCA TCCATGTCCA CAAAGCGGTG GTATGCGCCC 11700 

GTGTTGATGG TGTAAGTGCA GTTGGCCATA ACGGACCAGT TAACGGTCTG GTGACCCGGC 11760

TGCGAGAGCT CGGTGTACCT GAGACGCGAG TAAGCCCTCG AGTCAAATAC GTAGTCGTTG 11820 

CAAGTCCGCA CCAGGTACTG GTATCCCACC AAAAAGTGCG GCGGCGGCTG GCGGTAGAGG 11880 

GGCCAGCGTA GGGTGGCCGG GGCTCCGGGG GCGAGATCTT CCAACATAAG GCGATGATAT 11940

CCGTAGATGT ACCTGGACAT CCAGGTGATG CCGGCGGCGG TGGTGGAGGC GCGCGGAAAG 12000 

TCGCGGACGC GGTTCCAGAT GTTGCGCAGC GGCAAAAAGT GCTCCATGGT CGGGACGCTC 12060 

TGGCCGGTCA GGCGCGCGCA ATCGTTGACG CTCTACCGTG CAAAAGGAGA GCCTGTAAGC 12120 

GGGCACTCTT CCGTGGTCTG GTGGATAAAT TCGCAAGGGT ATCATGGCGG ACGACCGGGG 12180 

TTCGAGCCCC GTATCCGGCC GTCCGCCGTG ATCCATGCGG TTACCGCCCG CGTGTCGAAC 12240 

CCAGGTGTGC GACGTCAGAC AACGGGGGAG TGCTCCTTTT GGCTTCCTTC CAGGCGCGGC 12300 

GGCTGCTGCG CTAGCTTTTT TGGCCACTGG CCGCGCGCAG CGTAAGCGGT TAGGCTGGAA 12360 

AGCGAAAGCA TTAAGTGGCT CGCTCCCTGT AGCCGGAGGG TTATTTTCCA AGGGTTGAGT 12420 

CGCGGG^CCC CCGGTTCGAG TCTCGGACCG GCCGGACTGC GGCGAACGGG GGTTTGCCTC 12480 

CCCGTCATGC AAGACCCCGC TTGCAAATTC CTCCGGAAAC AGGGACGAGC CCCTTTTTTG 12540 

CTTTTCCCAG ATGCATCCGG TGCTGCGGCA GATGCGCCCC CCTCCTCAGC AGCGGCAAGA 12600 
GCAAGAGCAG CGGCAGACAT GCAGGGCACC CTCCCCTCCT CCTACCGCGT CAGGAGGGGC 12660 

GACATCCGCG GTTGACGCGG CAGCAGATGG TGATTACGAA CCCCCGCGGC GCCGGGCCGG 12720 

GCACTACCTG GACTTGGAGG AGGGCGAGGG CCTGGCGCGG CTAGGAGCGC CCTCTCCTGA 12780 

GCGGTACCCA AGGGTGCAGC TGAAGCGTGA TACGCGTGAG GCGTACGTGC CGCGGCAGAA 12840

CCTGT » i CGC GACCGCGAGG GAGAGGAGCC CGAGGAGATG CGGGATCGAA AGTTCCACGC 12900 

AGGGCGCGAG CTGCGGCATG GCCTGAATCG CGAGCGGTTG CTGCGCGAGG AGGACTTTGA 12960 

GCCCGACGCG CGAACCGGGA TTAGTCCCGC GCGCGCACAC GTGGCGGCCG CCGACCTGGT 13020 

AACCGCATAC GAGCAGACGG TGAACCAGGA GATTAACTTT CAAAAAAGCT TTAACAACCA 13080 

CGTGCGTACG CTTGTGGCGC GCGAGGAGGT GGCTATAGGA CTGATGCATC TGTGGGACTT 13140 

TGTAAGCGCG CTGGAGCAAA ACCCAAATAG CAAGCCGCTC ATGGCGCAGC TGTTCCTTAT 13200 

AGTGCAGCAC AGCAGGGACA ACGAGGCATT CAGGGATGCG CTGCTAAACA TAGTAGAGCC 13260 

CGAGGGCCGC TGGCTGCTCG ATTTGATAAA CATCCTGCAG AGCATAGTGG TGCAGGAGCG 13320

CAGCTTGAGC CTGGCTGACA AGGTGGCCGC CATCAACTAT TCCATGCTTA GCCTGGGCAA 13380 

GTTTTACGCC CGCAAGATAT ACCATACCCC TTACGTTCCC ATAGACAAGG AGGTAAAGAT 13440 

CGAGGGGTTC TACATGCGCA TGGCGCTGAA GGTGCTTACC TTGAGCGACG ACCTGGGCGT 13500

TTATCGCAAC GAGCGCATCC ACAAGGCCGT GAGCGTGAGC CGGCGGCGCG AGCTCAGCGA 13560

CCGCGAGCTG ATGCACAGCC TGCAAAGGGC CCTGGCTGGC ACGGGCAGCG GCGATAGAGA 13620

GGCCGAGTCC TACTTTGACG CGGGCGCTGA CCTGCGCTGG GCCCCAAGCC GACGCGCCCT 13680
GGAGGCAGCT GGGGCCGGAC CTGGGCTGGC GGTGGCACCC GCGCGCGCTG GCAACGTCGG 13740
CGGCGTGGAG GAATATGACG AGGACGATGA GTACGAGCCA GAGGACGGCG AGTACTAAGC 13800
GGTGATGTTT CTGATCAGAT GATGCAAGAC GCAACGGACC CGGCGGTGCG GGCGGCGCTG 13860
CAGAGCCAGC CGTCCGGCCT TAACTCCACG GACGACTGGC GCCAGGTCAT GGACCGCATC 13920
ATGTCGCTGA CTGCGCGCAA TCCTGACGCG TTCCGGCAGC AGCCGCAGGC CAACCGGCTC 13980
TCCGCAATTC TGGAAGCGGT GGTCCCGGCG CGCGCAAACC CCACGCACGA GAAGGTGCTG 14040
GCGATCGTAA ACGCGCTGGC CGAAAACAGG GCCATCCGGC CCGACGAGGC CGGCCTGGTC 14100
TACGACGCGC TGCTTCAGCG CGTGGCTCGT TACAACAGCG GCAACGTGCA GACCAACCTG 14160
GACCGGCTGG TGGGGGATGT GCGCGAGGCC GTGGCGCAGC GTGAGCGCGC GCAGCAGCAG 14220
GGCAACCTGG GCTCCATGGT TGCACTAAAC GCCTTCCTGA GTACACAGCC CGCCAACGTG 14280
CCGCGGGGAC AGGAGGACTA CACCAACTTT GTGAGCGCAC TGCGGCTAAT GGTGACTGAG 14340
ACACCGCAAA GTGAGGTGTA CCAGTCTGGG CCAGACTATT TTTTCCAGAC CAGTAGACAA 14400
GGCCTGCAGA CCGTAAACCT GAGCCAGGCT TTCAAAAACT TGCAGGGGCT GTGGGGGGTG 14460
CGGGCTCCCA CAGGCGACCG CGCGACCGTG TCTAGCTTGC TGACGCCCAA CTCGCGCCTG 14520
TTGCTGCTGC TAATAGCGCC CTTCACGGAC AGTGGCAGCG TGTCCCGGGA CACATACCTA 14580
GGTCACTTGC TGACACTGTA CCGCGAGGCC ATAGGTCAGG CGCATGTGGA CGAGCATACT 14640
TTCCAGGAGA TTACAAGTGT CAGCCGCGCG CTGGGGCAGG AGGACACGGG CAGCCTGGAG 14700
GCAACCCTAA ACTACCTGCT GACCAACCGG CGGCAGAAGA TCCCCTCGTT GCACAGTTTA 14760
AACAGCGAGG AGGAGCGCAT TTTGCGCTAC GTGCAGCAGA GCGTGAGCCT TAACCTGATG 14820
CGCGACGGGG TAACGCCCAG CGTGGCGCTG GACATGACCG CGCGCAACAT GGAACCGGGC 14880
ATGTATGCCT CAAACCGGCC GTTTATCAAC CGCCTAATGG ACTACTTGCA TCGCGCGGCC 14940
GCCGTGAACC CCGAGTATTT CACCAATGCC ATCTTGAACC CGCACTGGCT ACCGCCCCCT 15000
GGTTTCTACA CCGGGGGATT CGAGGTGCCC GAGGGTAACG ATGGATTCCT CTGGGACGAC 15060
ATAGACGACA GCGTGTTTTC CCCGCAACCG CAGACCCTGC TAGAGTTGCA ACAGCGCGAG 15120
CAGGCAGAGG CGGCGCTGCG AAAGGAAAGC TTCCGCAGGC CAAGCAGCTT GTCCGATCTA 15180
GGCGCTGCGG CCCCGCGGTC AGATGCTAGT AGCCCATTTC CAAGCTTGAT AGGGTCTCTT 15240
ACCAGCACTC GCACCACCCG CCCGCGCCTG CTGGGCGAGG AGGAGTACCT AAACAACTCG 15300
CTGCTGCAGC CGCAGCGCGA AAAAAACCTG CCTCCGGCAT TTCCCAACAA CGGGATAGAG 15360
AGCCTAGTGG ACAAGATGAG TAGATGGAAG ACGTACGCGC AGGAGCACAG GGACGTGCCA 15420
GGCCCGCGCC CGCCCACCCG TCGTCAAAGG CACGACCGTC AGCGGGGTCT GGTGTGGGAG 15480
GACGATGACT CGGCAGACGA CAGCAGCGTC CTGGATTTGG GAGGGAGTGG CAACCCGTTT 15540
GCGCACCTTC GCCCCAGGCT GGGGAGAATG TTTTAAAAAA AAAAAAGCAT GATGCAAAAT 15600
AAAAAACTCA CCAAGGCCAT GGCACCGAGC GTTGGTTTTC TTGTATTCCC CTTAGTATGC 15660
GGCGCGCGGC GATGTATGAG GAAGGTCCTC CTCCCTCCTA CGAGAGTGTG GTGAGCGCGG 15720
CGCCAGTGGC GGCGGCGCTG GGXTCTCCCT TCGATGCTCC CCTGGACCCG CCGTTTGTGC 15780
CTCCGCGGTA CCTGCGGCCT ACCGGGGGGA GAAACAGCAT CCGTTACTCT GAGTTGGCAC 15840
CCCTATTCGA CACCACCCGT GTGTACCTGG TGGACAACAA GTCAACGGAT GTGGCATCCC 15900
7GAACTACCA GAACGACCAC AGCAACTTTC TGACCACGGT CATTCAAAAC AATGACTACA 15960
GCCCGGGGGA GGCAAGCACA CAGACCATCA ATCTTGACGA CCGGTCGCAC TGGGGCGGCG 16020
ACCTGAAAAC CATCCTGCAT ACCAACATGC CAAATGTGAA CGAGTTCATG TTTACCAATA 16080
AGTTTAAGGC GCGGGTGATG GTGTCGCGCT TGCCTACTAA GGACAATCAG GTGGAGCTGA 16140
AATACGAGTG GGTGGAGTTC ACGCTGCCCG AGGGCAACTA CTCCGAGACC ATGACCATAG 16200
ACCTTATGAA CAACGCGATC GTGGAGCACT ACTTGAAAGT GGGCAGACAG AAGGGGGTTC 16260
TGGAAAGCGA CATCGGGGTA AAGTTTGACA CCCGCAACTT CAGACTGGGG TTTGACCCCG 16320
TCACTGGTCT TGTCATGCCT GGGGTATATA CAAACGAAGC CTTCCATCCA GACATCATTT 16380
TGCTGCCAGG ATGCGGGGTG GACTTCACCC ACAGCCGCCT GAGCAACTTG TTGGGCATCC 16440
GCAAGCGGCA ACCCTTCCAG GAGGGCTTTA GGATCACCTA CGATGATCTG GAGGGTGGTA 16500
ACATTCCCGC ACTGTTGGAT GTGGACGCCT ACCAGGCGAG CTTGAAAGAT GACACCGAAC 16560
AGGGCGGGGG TGGCGCAGGC GGCAGCAACA GCAGTGGCAG CGGCGCGGAA GAGAACTCCA 16620
ACGCGGCAGC CGCGGCAATG CAGCCGGTGG AGGACATGAA CGATCATGCC ATTCGCGGCG 16680
ACACCTTTGC CACACGGGCT GAGGAGAAGC GCGCTGAGGC CGAAGCAGCG GCCGAAGCTG 16740
CCGCCCCCGC TGCGCAACCC GAGGTCGAGA AGCCTCAGAA GAAACCGGTG ATCAAACCCC 16800
TGACAGAGGA CAGCAAGAAA CGCAGTTACA ACCTAATAAG CAATGACAGC ACCTTCACCC 16860
AGTACCGCAG CTGGTACCTT GCATACAACT ACGGCGACCC TCAGACCGGA ATCCGCTCAT 16920
GGACCCTGCT TTGCACTCCT GACGTAACCT GCGGCTCGGA GCAGGTCTAC TGGTCGTTGC 16980
CAGACATGAT GCAAGACCCC GTGACCTTCC GCTCCACGCG CCAGATCAGC AACTTTCCGG 17040
TGGTGGGCGC CGAGCTGTTG CCCGTGCACT CCAAGAGCTT CTACAACGAC CAGGCCGTCT 17100
ACTCCCAACT CATCCGCCAG TTTACCTCTC TGACCCACGT GTTCAATCGC TTTCCCGAGA 17160
ACCAGATTTT GGCGCGCCCG CCAGCCCCCA CCATCACCAC CGTCAGTGAA AACGTTCCTG 17220
CTCTCACAGA TCACGGGACG CTACCGCTGC GCAACAGCAT CGGAGGAGTC CAGCGAGTGA 17280
CCATTACTGA CGCCAGACGC CGCACCTGCC CCTACGTTTA CAAGGCCCTG GGCATAGTCT 17340
CGCCGCGCGT CCTATCGAGC CGCACTTTTT GAGCAAGCAT GTCCATCCTT ATATCGCCCA 17400
GCAATAACAC AGGCTGGGGC CTGCGCTTCC CAAGCAAGAT GTTTGGCGGG GCCAAGAAGC 17460
GCTCCGACCA ACACCCAGTG CGCGTGCGCG GGCACTACCG CGCGCCCTGG GGCGCGCACA 17520
AACGCGGCCG CACTGGGCGC ACCACCGTCG ATGACGCCAT CGACGCGGTG GTGGAGGAGG 17580
CGCGCAACTA CACGCCCACG CCGCCACCAG TGTCCACAGT GGACGCGGCC ATTCAGACCG 17640
TGGTGCGCGG AGCCCGGCGC TATGCTAAAA TGAAGAGACG GCGGAGGCGC GTAGCACGTC 17700
GCCACCGCCG CCGACCCGGC ACTGCCGCCC AACGCGCGGC GGCGGCCCTG CTTAACCGCG 17760
CACGTCGCAC CGGCCGACGG GCGGCCATGC GGGCCGCTCG AAGGCTGGCC GCGGGTATTG 17820
TCACTGTGCC CCCCAGGTCC AGGCGACGAG CGGCCGCCGC AGCAGCCGCG GCCATTAGTG 17880
CTATGACTCA GGGTCGCAGG GGCAACGTGT ATTGGGTGCG CGACTCGGTT AGCGGCCTGC 17940
GCGTGCCCGT GCGCACCCGC CCCCCGCGCA ACTAGATTGC AAGAAAAAAC TACTTAGACT 18000
CGTACTGTTG TATGTATCCA GCGGCGGCGG CGCGCAACGA AGCTATGTCC AAGCGCAAAA 18060
TCAAAGAAGA GATGCTCCAG GTCATCGCGC CGGAGATCTA TGGCCCCCCG AAGAAGGAAG 18120
AGCAGGATTA CAAGCCCCGA AAGCTAAAGC GGGTCAAAAA GAAAAAGAAA GATGATGATG 18180
ATGAACTTGA CGACGAGGTG GAACTGCTGC ACGCTACCGC GCCCAGGCGA CGGGTACAGT 18240
GGAAAGGTCG ACGCGTAAAA CGTGTTTTGC GACCCGGCAC CACCGTAGTC TTTACGCCCG 18300
GTGAGCGCTC CACCCGCACC TACAAGCGCG TGTATGATGA GGTGTACGGC GACGAGGACC 18360
TGCTTGAGCA GGCCAACGAG CGCCTCGGGG AGTTTGCCTA CGGAAAGCGG CATAAGGACA 18420
TGCTGGCGTT GCCGCTGGAC GAGGGCAACC CAACACCTAG CCTAAAGCCC GTAACACTGC 18480
AGCAGGTGCT GCCCGCGCTT GCACCGTCCG AAGAAAAGCG CGGCCTAAAG CGCGAGTCTG 18540
GTGACTTGGC ACCCACCGTG CAGCTGATGG TACCCAAGCG CCAGCGACTG GAAGATGTCT 18600
TGGAAAAAAT GACCGTGGAA CCTGGGCTGG AGCCCGAGGT CCGCGTGCGG CCAATCAAGC 18660
AGGTGGCGCC GGGACTGGGC GTGCAGACCG TGGACGTTCA GATACCCACT ACCAGTAGCA 18720
CCAGTATTGC CACCGCCACA GAGGGCATGG AGACACAAAC GTCCCCGGTT GCCTCAGCGG 18780
TGGCGGATGC CGCGGTGCAG GCGGTCGCTG CGGCCGCGTC CAAGACCTCT ACGGAGGTGC 18840
AAACGGACCC GTGGATGTTT CGCGTTTCAG CCCCCCGGCG CCCGCGCGGT TCGAGGAAGT 18900
ACGGCGCCGC CAGCGCGCTA CTGCCCGAAT ATGCCCTACA TCCTTCCATT GCGCCTACCC 18960
CCGGCTATCG TGGCTACACC TACCGCCCCA GAAGACGAGC AACTACCCGA CGCCGAACCA 19020
CCACTGGAAC CCGCCGCCGC CGTCGCCGTC GCCAGCCCGT GCTGGCCCCG ATTTCCGTGC 19080
GCAGGGTGGC TCGCGAAGGA GGCAGGACCC TGGTGCTGCC AACAGCGCGC TACCACCCCA 19140
GCATCGTTTA AAAGCCGGTC TTTGTGGTTC TTGCAGATAT GGCCCTCACC TGCCGCCTCC 19200
GTTTCCCGGT GCCGGGATTC CGAGGAAGAA TGCACCGTAG GAGGGGCATG GCCGGCCACG 19260
GCCTGACGGG CGGCATGCGT CGTGCGCACC ACCGGCGGCG GCGCGCGTCG CACCGTCGCA 19320
TGCGCGGCGG TATCCTGCCC CTCCTTATTC CACTGATCGC CGCGGCGATT GGCGCCGTGC 19380
CCGGAATTGC ATCCGTGGCC TTGCAGGCGC AGAGACACTG ATTAAAAACA AGTTGCATGT 19440
GGAAAAATCA AAATAAAAAG TCTGGACTCT CACGCTCGCT TGGTCCTGTA ACTATTTTGT 19500
AGAATGGAAG ACATCAACTT TGCGTCTCTG GCCCCGCGAC ACGGCTCGCG CCCGTTCATG 19560
GGAAACTGGC AAGATATCGG CACCAGCAAT ATGAGCGGTG GCGCCTTCAG CTGGGGCTCG 19620
CTGTGGAGCG GCATTAAAAA TTTCGGTTCC ACCGTTAAGA ACTATGGCAG CAAGGCCTGG 19680
AACAGCAGCA CAGGCCAGAT GCTGAGGGAT AAGTTGAAAG AGCAAAATTT CCAACAAAAG 19740
GTGGTAGATG GCCTGGCCTC TGGCATTAGC GGGGTGGTGG ACCTGGCCAA CCAGGCAGTG 19800
CAAAATAAGA TTAACAGTAA GCTTGATCCC CGCCCTCCCG TAGAGGAGCC TCCACCGGCC 19860
GTGGAGACAG TGTCTCCAGA GGGGCGTGGC GAAAAGCGTC CGCGCCCCGA CAGGGAAGAA 19920
ACTCTGGTGA CGCAAATAGA CGAGCCTCCC TCGTACGAGG AGGCACTAAA GCAAGGCCTG 19980




WO 98/17783 

CCCACCACCC GTCCCATCGC GCCCATGGCT ACCGGAGTGC TGGGCCAGCA CACACCCGTA 20040
ACGCTGGACC TGCCTCCCCC CGCCGACACC CAGCAGAAAC CTGTGCTGCC AGGCCCGACC 20100
GCCGTTGTTG TAACCCGTCC TAGCCGCGCG TCCCTGCGCC GCGCCGCCAG CGGTCCGCGA 20160
TCGTTGCGGC CCGTAGCCAG TGGCAACTGG CAAAGCACAC TGAACAGCAT CGTGGGTCTG 20220
GGGGTGCAAT CCCTGAAGCG CCGACGATGC TTCTGAATAG CTAACGTGTC GTATGTGTGT 20280
CATGTATGCG TCCATGTCGC CGCCAGAGGA GCTGCTGAGC CGCCGCGCGC CCGCTTTCCA 20340
AGATGGCTAC CCCTTCGATG ATGCCGCAGT GGTCTTACAT GCACATCTCG GGCCAGGACG 20400
CCTCGGAGTA CCTGAGCCCC GGGCTGGTGC AGTTTGCCCG CGCCACCGAG ACGTACTTCA 20460
GCCTGAATAA CAAGTTTAGA AACCCCACGG TGGCGCCTAC GCACGACGTG ACCACAGACC 20520
GGTCCCAGCG TTTGACGCTG CGGTTCATCC CTGTGGACCG TGAGGATACT GCGTACTCGT 20580
ACAAGGCGCG GTTCACCCTA GCTGTGGGTG ATAACCGTGT GCTGGACATG GCTTCCACGT 20640
ACTTTGACAT CCGCGGCGTG CTGGACAGGG GCCCTACTTT TAAGCCCTAC TCTGGCACTG 20700
CCTACAACGC CCTGGCTCCC AAGGGTGCCC CAAATCCTTG CGAATGGGAT GAAGCTGCTA 20760
CTGCTCTTGA AATAAACCTA GAAGAAGAGG ACGATGACAA CGAAGACGAA GTAGACGAGC 20820
AAGCTGAGCA GCAAAAAACT CACGTATTTG GGCAGGCGCC TTATTCTGGT ATAAATATTA 20880
CAAAGGAGGG TATTCAAATA GGTGTCGAAG GTCAAACACC TAAATATGCC GATAAAACAT 20940
TTCAACCTGA ACCTCAAATA GGAGAATCTC AGTGGTACGA AACTGAAATT AATCATGCAG 21000
CTGGGAGAGT CCTTAAAAAG ACTACCCCAA TGAAACCATG TTACGGTTCA TATGCAAAAC 21060
CCACAAATGA AAATGGAGGG CAAGGCATTC TTGTAAAGCA ACAAAATGGA AAGCTAGAAA 21120
GTCAAGTGGA AATGCAATTT TTCTCAACTA CTGAGGCGAC CGCAGGCAAT GGTGATAACT 21180
TGACTCCTAA AGTGGTATTG TACAGTGAAG ATGTAGATAT AGAAACCCCA GACACTCATA 21240
TTTCTTACAT GCCCACTATT AAGGAAGGTA ACTCACGAGA ACTAATGGGC CAACAATCTA 21300
TGCCCAACAG GCCTAATTAC ATTGCTTTTA GGGACAATTT TATTGGTCTA ATGTATTACA 21360
ACAGCACGGG TAATATGGGT GTTCTGGCGG GCCAAGCATC GCAGTTGAAT GCTGTTGTAG 21420
ATTTGCAAGA CAGAAACACA GAGCTTTCAT ACCAGCTTTT GCTTGATTCC ATTGGTGATA 21480
GAACCAGGTA CTTTTCTATG TGGAATCAGG CTGTTGACAG CTATGATCCA GATGTTAGAA 21540
TTATTGAAAA TCATGGAACT GAAGATGAAC TTCCAAATTA CTGCTTTCCA CTGGGAGGTG 21600
TGATTAATAC AGAGACTCTT ACCAAGGTAA AACCTAAAAC AGGTCAGGAA AATGGATGGG 21660
AAAAAGATGC TACAGAATTT TCAGATAAAA ATGAAATAAG AGTTGGAAAT AATTTTGCCA 21720
TGGAAATCAA TCTAAATGCC AACCTGTGGA GAAATTTCCT GTACTCCAAC ATAGCGCTGT 21780
ATTTGCCCGA CAAGCTAAAG TACAGTCCTT CCAACGTAAA AATTTCTGAT AACCCAAACA 21840
CCTACGACTA CATGAACAAG CGAGTGGTGG CTCCCGGGTT AGTGGACTGC TACATTAACC 21900
TTGGAGCACG CTGGTCCCTT GACTATATGG ACAACGTCAA CCCATTTAAC CACCACCGCA 21960
ATGCTGGCCT GCGCTACCGC TCAATGTTGC TGGGCAATGG TCGCTATGTG CCCTTCCACA 22020
TCCAGGTGCC TCAGAAGTTC TTTGCCATTA AAAACCTCCT TCTCCTGCCG GGCTCATACA 22080
CCTACGAGTG GAACTTCAGG AAGGATGTTA ACATGGTTCT GCAGAGCTCC CTAGGAAATG 22140
ACCTAAGGGT TGACGGAGCC AGCATTAAGT TTGATAGCAT TTGCCTTTAC GCCACCTTCT 22200
TCCCCATGGC CCACAACACC GCCTCCACGC TTGAGGCCAT GCTTAGAAAC GACACCAACG 22260
ACCAGTCCTT TAACGACTAT CTCTCCGCCG CCAACATGCT CTACCCTATA CCCGCCAACG 22320
CTACCAACGT GCCCATATCC ATCCCCTCCC GCAACTGGGC GGCTTTCCGC GGCTGGGCCT 22380
TCACGCGCCT TAAGACTAAG GAAACCCCAT CACTGGGCTC GGGCTACGAC CCTTATTACA 22440
CCTACTCTGG CTCTATACCC TACCTAGATG GAACCTTTTA CCTCAACCAC ACCTTTAAGA 22500
AGGTGGCCAT TACCTTTGAC TCTTCTGTCA GCTGGCCTGG CAATGACCGC CTGCTTACCC 22560
CCAACGAGTT TGAAATTAAG CGCTCAGTTG ACGGGGAGGG TTACAACGTT GCCCAGTGTA 22620
ACATGACCAA AGACTGGTTC CTGGTACAAA TGCTAGCTAA CTACAACATT GGCTACCAGG 22680
GCTTCTATAT CCCAGAGAGC TACAAGGACC GCATGTACTC CTTCTTTAGA AACTTCCAGC 22740
CCATGAGCCG TCAGGTGGTG GATGATACTA AATACAAGGA CTACCAACAG GTGGGCATCC 22800
TACACCAACA CAACAACTCT GGATTTGTTG GCTACCTTGC CCCCACCATG CGCGAAGGAC 22860
AGGCCTACCC TGCTAACTTC CCCTATCCGC TTATAGGCAA GACCGCAGTT GACAGCATTA 22920
CCCAGAAAAA GTTTCTTTGC GATCGCACCC TTTGGCGCAT CCCATTCTCC AGTAACTTTA 22980
TGTCCATGGG CGCACTCACA GACCTGGGCC AAAACCTTCT CTACGCCAAC TCCGCCCACG 23040
CGCTAGACAT GACTTTTGAG GTGGATCCCA TGGACGAGCC CACCCTTCTT TATGTTTTGT 23100
TTGAAGTCTT TGACGTGGTC CGTGTGCACC GGCCGCACCG CGGCGTCATC GAAACCGTGT 23160
ACCTGCGCAC GCCCTTCTCG GCCGGCAACG CCACAACATA AAGAAGCAAG CAACATCAAC 23220
AACAGCTGCC GCCATGGGCT CCAGTGAGCA GGAACTGAAA GCCATTGTCA AAGATCTTGG 23280
TTGTGGGCCA TATTTTTTGG GCACCTATGA CAAGCGCTTT CCAGGCTTTG TTTCTCCACA 23340
CAAGCTCGCC TGCGCCATAG TCAATACGGC CGGTCGCGAG ACTGGGGGCG TACACTGGAT 23400
GGCCTTTGCC TGGAACCCGC ACTCAAAAAC ATGCTACCTC TTTGAGCCCT TTGGCTTTTC 23460
TGACCAGCGA CTCAAGCAGG TTTACCAGTT TGAGTACGAG TCACTCCTGC GCCGTAGCGC 23520
CATTGCTTCT TCCCCCGACC GCTGTATAAC GCTGGAAAAG TCCACCCAAA GCGTACAGGG 23580
GCCCAACTCG GCCGCCTGTG GACTATTCTG CTGCATGTTT CTCCACGCCT TTGCCAACTG 23640
GCCCCAAACT CCCATGGATC ACAACCCCAC CATGAACCTT ATTACCGGGG TACCCAACTC 23700
CATGCTCAAC AGTCCCCAGG TACAGCCCAC CCTGCGTCGC AACCAGGAAC AGCTCTACAG 23760
CTTCCTGGAG CGCCACTCGC CCTACTTCCG CAGCCACAGT GCGCAGATTA GGAGCGCCAC 23820
TTCTTTTTGT CACTTGAAAA ACATGTAAAA ATAATGTACT AGAGACACTT TCAATAAAGG 23880
CAAATGCTTT TATTTGTACA CTCTCGGGTG ATTATTTACC CCCACCCTTG CCGTCTGCGC 23940
CGTTTAAAAA TCAAAGGGGT TCTGCCGCGC ATCGCTATGC GCCACTGGCA GGGACACGTT 24000
GCGATACTGG TGTTTAGTGC TCCACTTAAA CTCAGGCACA ACCATCCGCG GCAGCTCGGT 24060
GAAGTTTTCA CTCCACAGGC TGCGCACCAT CACCAACGCG TTTAGCAGGT CGGGCGCCGA 24120
TATCTTGAAG TCGCAGTTGG GGCCTCCGCC CTGCGCGCGC GAGTTGCGAT ACACAGGGTT 24180
GCAGCACTGG AACACTATCA GCGCCGGGTG GTGCACGCTG GCCAGCACGC TCTTGTCGGA 24240
GATCAGATCC GCGTCCAGGT CCTCCGCGTT GCTCAGGGCG AACGGAGTCA ACTTTGGTAG 24300
CTGCCTTCCC AAAAAGGGCG CGTGCCCAGG CTTTGAGTTG CACTCGCACC GTAGTGGCAT 24360
CAAAAGGTGA CCGTGCCCGG TCTGGGCGTT AGGATACAGC GCCTGCATAA AAGCCTTGAT 24420
CTGCTTAAAA GCCACGTGAG CCTTTGCGCC TTCAGAGAAG AACATGCCGC AAGACTTGCC 24480
GGAAAACTGA TTGGCCGGAC AGGCCGCGTC GTGCACGCAG CACCTTGCGT CGGTGTTGGA 24540
GATCTGCACC ACATTTCGGC CCCACCGGTT CTTCACGATC TTGGCCTTGC TAGACTGCTC 24600
CTTCAGCGCG CGCTGCCCGT TTTCGCTCGT CACATCCATT TCAATCACGT GCTCCTTATT 24660
TATCATAATG CTTCCGTGTA GACACTTAAG CTCGCCTTCG ATCTCAGCGC AGCGGTGCAG 24720
CCACAACGCG CAGCCCGTGG GCTCGTGATG CTTGTAGGTC ACCTCTGCAA ACGACTGCAG 24780
GTACGCCTGC AGGAATCGCC CCATCATCGT CACAAAGGTC TTGTTGCTGG TGAAGGTCAG 24840
CTGCAACCCG CGGTGCTCCT CGTTCAGCCA GGTCTTGCAT ACGGCCGCCA GAGCTTCCAC 24900
TTGGTCAGGC AGTAGTTTGA AGTTCGCCTT TAGATCGTTA TCCACGTGGT ACTTGTCCAT 24960
CAGCGCGCGC GCAGCCTCCA TGCCCTTCTC CCACGCAGAC ACGATCGGCA CACTCAGCGG 25020
GTTCATCACC GTAATTTCAC TTTCCGCTTC GCTGGGCTCT TCCTCTTCCT CTTGCGTCCG 25080
CATACCACGC GCCACTGGGT CGTCTTCATT CAGCCGCCGC ACTGTGCGCT TACCTCCTTT 25140
GCCATGCTTG ATTAGCACCG GTGGGTTGCT GAAACCCACC ATTTGTAGCG CCACATCTTC 25200
TCTTTCTTCC TCGCTGTCCA CGATTACCTC TGGTGATGGC GGGCGCTCGG GCTTGGGAGA 25260
AGGGCGCTTC TTTTTCTTCT TGGGCGCAAT GGCCAAATCC GCCGCCGAGG TCGATGGCCG 25320
CGGGCTGGGT GTGCGCGGCA CCAGCGCGTC TTGTGATGAG TCTTCCTCGT CCTCGGACTC 25380
GATACGCCGC CTCATCCGCT TTTTTGGGGG CGCCCGGGGA GGCGGCGGCG ACGGGGACGG 25440
GGACGACACG TCCTCCATGG TTGGGGGACG TCGCGCCGCA CCGCGTCCGC GCTCGGGGGT 25500
GGTTTCGCGC TGCTCCTCTT CCCGACTGGC CATTTCCTTC TCCTATAGGC AGAAAAAGAT 25560
CATGGAGTCA GTCGAGAAGA AGGACAGCCT AACCGCCCCC TCTGAGTTCG CCACCACCGC 25620
CTCCACCGAT GCCGCCAACG CGCCTACCAC CTTCCCCGTC GAGGCACCCC CGCTTGAGGA 25680
GGAGGAAGTG ATTATCGAGC AGGACCCAGG TTTTGTAAGC GAAGACGACG AGGACCGCTC 25740
AGTACCAACA GAGGATAAAA AGCAAGACCA GGACAACGCA GAGGCAAACG AGGAACAAGT 25800
CGGGCGGGGG GACGAAAGGC ATGGCGACTA CCTAGATGTG GGAGACGACG TGCTGTTGAA 25860
GCATCTGCAG CGCCAGTGCG CCATTATCTG CGACGCGTTG CAAGAGCGCA GCGATGTGCC 25920
CCTCGCCATA GCGGATGTCA GCCTTGCCTA CGAACGCCAC CTATTCTCAC CGCGCGTACC 25980
CCCCAAACGC CAAGAAAACG GCACATGCGA GCCCAACCCG CGCCTCAACT TCTACCCCGT 26040
ATTTGCCGTG CCAGAGGTGC TTGCCACCTA TCACATCTTT TTCCAAAACT GCAAGATACC 26100
CCTATCCTGC CGTGCCAACC GCAGCCGAGC GGACAAGCAG CTGGCCTTGC GGCAGGGCGC 26160
TGTCATACCT GATATCGCCT CGCTCAACGA AGTGCCAAAA ATCTTTGAGG GTCTTGGACG 26220
CGACGAGAAG CGCGCGGCAA ACGCTCTGCA ACAGGAAAAC AGCGAAAATG AAAGTCACTC 26280
TGGAGTGTTG GTGGAACTCG AGGGTGACAA CGCGCGCCTA GCCGTACTAA AACGCAGCAT 26340
CGAGGTCACC CACTTTGCCT ACCCGGCACT TAACCTACCC CCCAAGGTCA TGAGCACAGT 26400
CATGAGTGAG CTGATCGTGC GCCGTGCGCA GCCCCTGGAG AGGGATGCAA ATTTGCAAGA 26460
ACAAACAGAG GAGGGCCTAC CCGCAGTTGG CGACGAGCAG CTAGCGCGCT GGCTTCAAAC 26520
GCGCGAGCCT GCCGACTTGG AGGAGCGACG CAAACTAATG ATGGCCGCAG TGCTCGTTAC 26580
CGTGGAGCTT GAGTGCATGC AGCGGTTCTT TGCTGACCCG GAGATGCAGC GCAAGCTAGA 26640
GGAAACATTG CACTACACCT TTCGACAGGG CTACGTACGC CAGGCCTGCA AGATCTCCAA 26700
CGTGGAGCTC TGCAACCTGG TCTCCTACCT TGGAATTTTG CACGAAAACC GCCTTGGGCA 26760
AAACGTGCTT CATTCCACGC TCAAGGGCGA GGCGCGCCGC GACTACGTCC GCGACTGCGT 26820
TTACTTATTT CTATGCTACA CCTGGCAGAC GGCCATGGGC GTTTGGCAGC AGTGCTTGGA 26880
GGAGTGCAAC CTCAAGGAGC TGCAGAAACT GCTAAAGCAA AACTTGAAGG ACCTATGGAC 26940
GGCCTTCAAC GAGCGCTCCG TGGCCGCGCA CCTGGCGGAC ATCATTTTCC CCGAACGCCT 27000
GCTTAAAACC CTGCAACAGG GTCTGCCAGA CTTCACCAGT CAAAGCATGT TGCAGAACTT 27060
TAGGAACTTT ATCCTAGAGC GCTCAGGAAT CTTGCCCGCC ACCTGCTGTG CACTTCCTAG 27120
CGACTTTGTG CCCATTAAGT ACCGCGAATG CCCTCCGCCG CTTTGGGGCC ACTGCTACCT 27180
TCTGCAGCTA GCCAACTACC TTGCCTACCA CTCTGACATA ATGGAAGACG TGAGCGGTGA 27240
CGGTCTACTG GAGTGTCACT GTCGCTGCAA CCTATGCACC CCGCACCGCT CCCTGGTTTG 27300
CAATTCGCAG CTGCTTAACG AAAGTCAAAT TATCGGTACC TTTGAGCTGC AGGGTCCCTC 27360
GCCTGACGAA AAGTCCGCGG CTCCGGGGTT GAAACTCACT CCGGGGCTGT GGACGTCGGC 27420
TTACCTTCGC AAATTTGTAC CTGAGGACTA CCACGCCCAC GAGATTAGGT TCTACGAAGA 27480
CCAATCCCGC CCGCCAAATG CGGAGCTTAC CGCCTGCGTC ATTACCCAGG GCCACATTCT 27540
TGGCCAATTG CAAGCCATCA ACAAAGCCCG CCAAGAGTTT CTGCTACGAA AGGGACGGGG 27600
GGTTTACTTG GACCCCCAGT CCGGCGAGGA GCTCAACCCA ATCCCCCCGC CGCCGCAGCC 27660
CTATCAGCAG CAGCCGCGGG CCCTTGCTTC CCAGGATGGC ACCCAAAAAG AAGCTGCAGC 27720
TGCCGCCGCC ACCCACGGAC GAGGAGGAAT ACTGGGACAG TCAGGCAGAG GAGGTTTTGG 27780
ACGAGGAGGA GGAGGACATG ATGGAAGACT GGGAGAGCCT AGACGAGGAA GCTTCCGAGG 27840
TCGAAGAGGT GTCAGACGAA ACACCGTCAC CCTCGGTCGC ATTCCCCTCG CCGGCGCCCC 27900
AGAAATCGGC AACCGGTTCC AGCATGGCTA CAACCTCCGC TCCTCAGGCG CCGCCGGCAC 27960
TGCCCGTTCG CCGACCCAAC CGTAGATGGG ACACCACTGG AACCAGGGCC GGTAAGTCCA 28020
AGCAGCCGCC GCCGTTAGCC CAAGAGCAAC AACAGCGCCA AGGCTACCGC TCATGGCGCG 28080
GGCACAAGAA CGCCATAGTT GCTTGCTTGC AAGACTGTGG GGGCAACATC TCCTTCGCCC 28140
GCCGCTTTCT TCTCTACCAT CACGGCGTGG CCTTCCCCCG TAACATCCTG CATTACTACC 28200
GTCATCTCTA CAGCCCATAC TGCACCGGCG GCAGCGGCAG CGGCAGCAAC AGCAGCGGCC 28260
ACACAGAAGC AAAGGCGACC GGATAGCAAG ACTCTGACAA AGCCCAAGAA ATCCACAGCG 28320
GCGGCAGCAG CAGGAGGAGG AGCGCTGCGT CTGGCGCCCA ACGAACCCGT ATCGACCCGC 28380
GAGCTTAGAA ACAGGATTTT TCCCACTCTG TATGCTATAT TTCAACAGAG CAGGGGCCAA 28440

GAACAAGAGC TGAAAATAAA AAACAGGTCT CTGCGATCCC TCACCCGCAG CTGCCTGTAT 28500 

CACAAAAGCG AAGATCAGCT TCGGCGCACG CTGGAAGACG CGGAGGCTCT CTTCAGTAAA 28560 

TACTGCGCGC TGACTCTTAA GGACTAGTTT CGCGCCCTTT CTCAAATTTA AGCGCGAAAA 28620 

CTACGTCATC TCCAGCGGCC ACACCCGGCG CCAGCACCTG TCGTCAGCGC CATTATGAGC 28680

AAGGAAATTC CCACGCCCTA CATGTGGAGT TACCAGCCAC AAATGGGACT TGCGGCTGGA 28740 

GCTGCCCAAG ACTACTCAAC CCGAATAAAC TACATGAGCG CGGGACCCCA CATGATATCC 28800

CGGGTCAACG GAATCCGCGC CCACCGAAAC CGAATTCTCT TGGAACAGGC GGCTATTACC 28860

ACCACACCTC GTAATAACCT TAATCCCCGT AGTTGGCCCG CTGCCCTGGT GTACCAGGAA 28920 

AGTCCCGCTC GCACCACTGT GGTACTTCCC AGAGACGCCC AGGCCGAAGT TCAGATGACT 28980 

AACTCAGGGG CGCAGCTTGC GGGCGGCTTT CGTCACAGGG TGCGGTCGCC CGGGCAGGGT 29040 

ATAACTCACC TGACAATCAG AGGGCGAGGT ATTCAGCTCA ACGACGAGTC GGTGAGCTCC 29100 

TCGCTTGGTC TCCGTCCGGA CGGGACATTT CAGATCGGCG GCGCCGGCCG TCCTTCATTC 29160 

ACGCCTCGTC AGGCAATCCT AACTCTGCAG ACCTCGTCCT CTGAGCCGCG CTCTGGAGGC 29220 

ATTGGAACTC TGCAATTTAT TGAGGAGTTT GTGCCATCGG TCTACTTTAA CCCCTTCTCG 29280 

GGACCTCCCG GCCACTATCC GGATCAATTT ATTCCTAACT TTGACGCGGT AAAGGACTCG 29340

GCGGACGGCT ACGACTGAAT GTTAATTAAG TTCCTGTCCA TCCGCACCCA CTATCTTCAT 29400 

GTTGTTGCAG ATGAAGCGCG CAAGACCGTC TGAAGATACC TTCAACCCCG TGTATCCATA 29460 

TGACACGGAA ACCGGTCCTC CAACTGTGCC TTTTCTTACT CCTCCCTTTG TATCCCCCAA 29520

TGGGTTTCAA GAGAGTCCCC CTGGGGTACT CTCTTTGCGC CTATCCGAAC CTCTAGTTAC 29580 

CTCCAATGGC ATGCTTGCGC TCAAAATGGG CAACGGCCTC TCTCTGGACG AGGCCGGCAA 29640 

CCTTACCTCC CAAAATGTAA CCACTGTGAG CCCACCTCTC AAAAAAACCA AGTCAAACAT 29700 

AAACCTGGAA ATATCTGCAC CCCTCACAGT TACCTCAGAA GCCCTAACTG TGGCTGCCGC 29760 

CGCACCTCTA ATGGTCGCGG GCAACACACT CACCATGCAA TCACAGGCCC CGCTAACCGT 29820 

GCACGACTCC AAACTTAGCA TTGCCACCCA AGGACCCCTC ACAGTGTCAG AAGGAAAGCT 29880 

AGCCCTGCAA ACATCAGGCC CCCTCACCAC CACCGATAGC AGTACCCTTA CTATCACTGC 29940 

CTCACCCCCT CTAACTACTG CCACTGGTAG CTTGGGCATT GACTTGAAAG AGCCCATTTA 30000 

TACACAAAAT GGAAAACTAG GACTAAAGTA CGGGGCTCCT TTGCATGTAA CAGACGACCT 30060 

AAACACTTTG ACCGTAGCAA CTGGTCCAGG TGTGACTATT AATAATACTT CCTTGCAAAC 30120

TAAAGTTACT GGAGCCTTGG GTTTTGATTC ACAAGGCAAT ATGCAACTTA ATGTAGCAGG 30180 

AGGACTAAGG ATTGATTCTC AAAACAGACG CCTTATACTT GATGTTAGTT ATCCGTTTGA 30240 

TGCTCAAAAC CAACTAAATC TAAGACTAGG ACAGGGCCCT CTTTTTATAA ACTCAGCCCA 30300 

CAACTTGGAT ATTAACTACA ACAAAGGCCT TTACTTGTTT ACAGCTTCAA ACAATTCCAA 30360

AAAGCTTGAG GTTAACCTAA GCACTGCCAA GGGGTTGATG TTTGACGCTA CAGCCATAGC 30420 

CATTAATGCA GGAGATGGGC TTGAATTTGG TTCACCTAAT GCACCAAACA CAAATCCCCT 30480

CAAAACAAAA ATTGGCCATG GCCTAGAATT TGATTCAAAC AAGGCTATGG TTCCTAAACT 30540

AGGAACTGGC CTTAGTTTTG ACAGCACAGG TGCCATTACA GTAGGAAACA AAAATAATGA 30600 

TAAGCTAACT TTGTGGACCA CACCAGCTCC ATCTCCTAAC TGTAGACTAA ATGCAGAGAA 30660

AGATGCTAAA CTCACTTTGG TCTTAACAAA ATGTGGCAGT CAAATACTTG CTACAGTTTC 30720 

AGTTTTGGCT GTTAAAGGCA GTTTGGCTCC AATATCTGGA ACAGTTCAAA GTGCTCATCT 30780

TATTATAAGA TTTGACGAAA ATGGAGTGCT ACTAAACAAT TCCTTCCTGG ACCCAGAATA 30840 

TTGGAACTTT AGAAATGGAG ATCTTACTGA AGGCACAGCC TATACAAACG CTGTTGGATT 30900 

TATGCCTAAC CTATCAGCTT ATCCAAAATC TCACGGTAAA ACTGCCAAAA GTAACATTGT 30960 

CAGTCAAGTT TACTTAAACG GAGACAAAAC TAAACCTGTA ACACTAACCA TTACACTAAA 31020 

CGGTACACAG GAAACAGGAG ACACAACTCC AAGTGCATAC TCTATGTCAT TTTCATGGGA 31080 

CTGGTCTGGC CACAACTACA TTAATGAAAT ATTTGCCACA TCCTCTTACA CTTTTTCATA 31140 

CATTGCCCAA GAATAAAGAA TCGTTTGTGT TATGTTTCAA CGTGTTTATT TTTCAATTGC 31200 

AGAAAATTTC AAGTCATTTT TCATTCAGTA GTATAGCCCC ACCACCACAT AGCTTATACA 31260

GATCACCGTA CCTTAATCAA ACTCACAGAA CCCTAGTATT CAACCTGCCA CCTCCCTCCC 31320 

AACACACAGA GTACACAGTC CTTTCTCCCC GGCTGGCCTT AAAAAGCATC ATATCATGGG 31380 

TAACAGACAT ATTCTTAGGT GTTATATTCC ACACGGTTTC CTGTCGAGCC AAACGCTCAT 31440

CAGTGATATT AATAAACTCC CCGGGCAGCT CACTTAAGTT CATGTCGCTG TCCAGCTGCT 31500 

GAGCCACAGG CTGCTGTCCA ACTTGCGGTT GCTTAACGGG CGGCGAAGGA GAAGTCCACG 31560 

CCTACATGGG GGTAGAGTCA TAATCGTGCA TCAGGATAGG GCGGTGGTGC TGCAGCAGCG 31620 

CGCGAATAAA CTGCTGCCGC CGCCGCTCCG TCCTGCAGGA ATACAACATG GCAGTGGTCT 31680 

CCTCAGCGAT GATTCGCACC GCCCGCAGCA TAAGGCGCCT TGTCCTCCGG GCACAGCAGC 31740 

GCACCCTGAT CTCACTTAAA TCAGCACAGT AACTGCAGCA CAGCACCACA ATATTGTTCA 31800

AAATCCCACA GTGCAAGGCG CTGTATCCAA AGCTCATGGC GGGGACCACA GAACCCACGT 31860 

GGCCATCATA CCACAAGCGC AGGTAGATTA AGTGGCGACC CCTCATAAAC ACGCTGGACA 31920 

TAAACATTAC CTCTTTTGGC ATGTTGTAAT TCACCACCTC CCGGTACCAT ATAAACCTCT 31980

GATTAAACAT GGCGCCATCC ACCACCATCC TAAACCAGCT GGCCAAAACC TGCCCGCCGG 32040

CTATACACTG CAGGGAACCG GGACTGGAAC AATGACAGTG GAGAGCCCAG GACTCGTAAC 32100 

CATGGATCAT CATGCTCGTC ATGATATCAA TGTTGGCACA ACACAGGCAC ACGTGCATAC 32160 

ACTTCCTCAG GATTACAAGC TCCTCCCGCG TTAGAACCAT ATCCCAGGGA ACAACCCATT 32220 

CCTGAATCAG CGTAAATCCC ACACTGCAGG GAAGACCTCG CACGTAACTC ACGTTGTGCA 32280 

TTGTCAAAGT GTTACATTCG GGCAGCAGCG GATGATCCTC CAGTATGGTA GCGCGGGTTT 32340 

CTGTCTCAAA AGGAGGTAGA CGATCCCTAC TGTACGGAGT GCGCCGAGAC AACCGAGATC 32400 

GTGTTGGTCG TAGTGTCATG CCAAATGGAA CGCCGGACGT AGTCATATTT CCTGAAGCAA 32460 

AACCAGGTGC GGGCGTGACA AACAGATCTG CGTCTCCGGT CTCGCCGCTT AGATCGCTCT 32520 

GTGTAGTAGT TGTAGTATAT CCACTCTCTC AAAGCATCCA GGCGCCCCCT GGCTTCGGGT 32580

TCTATGTAAA CTCCTTCATG CGCCGCTGCC CTGATAACAT CCACCACCGC AGAATAAGCC 32640
ACACCCAGCC AACCTACACA TTCGTTCTGC GAGTCACACA CGGGAGGAGC GGGAAGAGCT 32700
GGAAGAACCA TGTTTTTTTT TTTATTCCAA AAGATTATCC AAAACCTCAA AATGAAGATC 32760
TATTAAGTGA ACGCGCTCCC CTCCGGTGGC GTGGTCAAAC TCTACAGCCA AAGAACAGAT 32820
AATGGCATTT GTAAGATGTT GCACAATGGC TTCCAAAAGG CAAACGGCCC TCACGTCCAA 32880
GTGGACGTAA AGGCTAAACC CTTCAGGGTG AATCTCCTCT ATAAACATTC CAGCACCTTC 32940
AACCATGCCC AAATAATTCT CATCTCGCCA CCTTCTCAAT ATATCTCTAA GCAAATCCCG 33000
AATATTAAGT CCGGCCATTG TAAAAATCTG CTCCAGAGCG CCCTCCACCT TCAGCCTCAA 33060
GCAGCGAATC ATGATTGCAA AAATTCAGGT TCCTCACAGA CCTGTATAAG ATTCAAAAGC 33120
GGAACATTAA CAAAAATACC GCGATCCCGT AGGTCCCTTC GCAGGGCCAG CTGAACATAA 33180
TCGTGCAGGT CTGCACGGAC CAGCGCGGCC ACTTCCCCGC CAGGAACCTT GACAAAAGAA 33240
CCCACACTGA TTATGACACG CATACTCGGA GCTATGCTAA CCAGCGTAGC CCCGATGTAA 33300
GCTTTGTTGC ATGGGCGGCG ATATAAAATG CAAGGTGCTG CTCAAAAAAT CAGGCAAAGC 33360
CTCGCGCAAA AAAGAAAGCA CATCGTAGTC ATGCTCATGC AGATAAAGGC AGGTAAGCTC 33420
CGGAACCACC ACAGAAAAAG ACACCATTTT TCTCTCAAAC ATGTCTGCGG GTTTCTGCAT 33480
AAACACAAAA TAAAATAACA AAAAAACATT TAAACATTAG AAGCCTGTCT TACAACAGGA 33540
AAAACAACCC TTATAAGCAT AAGACGGACT ACGGCCATGC CGGCGTGACC GTAAAAAAAC 33600
TGGTCACCGT GATTAAAAAG CACCACCGAC AGCTCCTCGG TCATGTCCGG AGTCATAATG 33660
TAAGACTCGG TAAACACATC AGGTTGATTC ATCGGTCAGT GCTAAAAAGC GACCGAAATA 33720
GCCCGGGGGA ATACATACCC GCAGGCGTAG AGACAACATT ACAGCCCCCA TAGGAGGTAT 33780
AACAAAATTA ATAGGAGAGA AAAACACATA AACACCTGAA AAACCCTCCT GCCTAGGCAA 33840
AATAGCACCC TCCCGCTCCA GAACAACATA CAGCGCTTCA CAGCGGCAGC CTAACAGTCA 33900
GCCTTACCAG TAAAAAAGAA AACCTATTAA AAAAACACCA CTCGACACGG CACCAGCTCA 33960
ATCAGTCACA GTGTAAAAAA GGGCCAAGTG CAGAGCGAGT ATATATAGGA CTAAAAAATG 34020
ACGTAACGGT TAAAGTCCAC AAAAAACACC CAGAAAACCG CACGCGAACC TACGCCCAGA 34080
AACGAAAGCC AAAAAACCCA CAACTTCCTC AAATCGTCAC TTCCGTTTTC CCACGTTACG 34140
TAACTTCCCA TTTTAAGAAA ACTACAATTC CCAACACATA CAAGTTACTC CGCCCTAAAA 34200
CCTACGTCAC CCGCCCCGTT CCCACGCCCC GCGCCACGTC ACAAACTCCA CCCCCTCATT 34260
ATCATATTGG CTTCAATCCA AAATAAGGTA TATTATTGAT GAT 34303

(2) INFORMATION FOR SEQ ID NO: 5:



(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 380 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "DNA" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

CCAGGCTTTA CACTTTATGC TTCCGGCTCG TATGTTGTGT GGAATTGTGA GCGGATAACA 60

ATTTCACACA GGAAACAGCT ATGACCATGA TTACGCCAAG CGCGCAATTA ACCCTCACTA 120 

AAGGGAACAA AAGCTGGGTA CCGGGCCCCC CCTCGAGGTC GACGGTATCG ATAAGCTTAC 180 

GCGTGGCCTA GGCGGCCGAA TTCCTGCAGC GCGGGGGATC CACTAGTTCT AGAGCGGCCG 240 

CCACCGCGGC GCCTTAATTA ATACGTAAGC TCCAATTCGC CCTATAGTGA GTCGTATTAC 300 

GCGCGCTCAC TGGCCGTCGT TTTACAACGT CGTGACTGGG AAAACCCTGG CGTTACCCAA 360 

CTTAATCGCC TTGCAGCACA 380 
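
The running totals printed at the end of each row of a listing such as SEQ ID NO: 5 can be cross-checked mechanically against the declared LENGTH. The following Python sketch is illustrative only and forms no part of the original disclosure. The function name `check_listing` is hypothetical, and the rows are those of SEQ ID NO: 5 as listed, on the assumption that the group scanned as CACT7TATGC reads CACTTTATGC.

```python
def check_listing(rows, declared_length):
    """Join the 10-base groups of a listing and verify every running total."""
    groups = []
    for row in rows:
        # Each row is a run of base groups followed by a running base count.
        *row_groups, counter = row.split()
        groups.extend(row_groups)
        counted = sum(len(g) for g in groups)
        if counted != int(counter):
            raise ValueError(f"row marked {counter} ends at base {counted}")
    sequence = "".join(groups)
    if len(sequence) != declared_length:
        raise ValueError(f"declared {declared_length}, found {len(sequence)}")
    return sequence


# Rows of SEQ ID NO: 5 (declared LENGTH: 380 base pairs).
SEQ_ID_NO_5_ROWS = [
    "CCAGGCTTTA CACTTTATGC TTCCGGCTCG TATGTTGTGT GGAATTGTGA GCGGATAACA 60",
    "ATTTCACACA GGAAACAGCT ATGACCATGA TTACGCCAAG CGCGCAATTA ACCCTCACTA 120",
    "AAGGGAACAA AAGCTGGGTA CCGGGCCCCC CCTCGAGGTC GACGGTATCG ATAAGCTTAC 180",
    "GCGTGGCCTA GGCGGCCGAA TTCCTGCAGC GCGGGGGATC CACTAGTTCT AGAGCGGCCG 240",
    "CCACCGCGGC GCCTTAATTA ATACGTAAGC TCCAATTCGC CCTATAGTGA GTCGTATTAC 300",
    "GCGCGCTCAC TGGCCGTCGT TTTACAACGT CGTGACTGGG AAAACCCTGG CGTTACCCAA 360",
    "CTTAATCGCC TTGCAGCACA 380",
]

seq5 = check_listing(SEQ_ID_NO_5_ROWS, 380)
print(len(seq5))
```

A check of this kind only confirms that the groups and counters are mutually consistent; it cannot detect a substituted base.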

(2) INFORMATION FOR SEQ ID NO: 6:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "DNA" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

TGCCGCAGCA CCGGATGCAT C 21 

(2) INFORMATION FOR SEQ ID NO : 7 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "DNA" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 7 : 

GCGTCCGGAG GCTGCCATGC GGCAGGG 27 

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1481 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "DNA" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 8 : 



GGCTGCCATG CGGCAGGGAT ACGGCGCTAA CGATGCATCT CAACAATTGT TGTGTAGGTA 60
CTCCGCCGCC GAGGGACCTG AGCGAGTCCG CATCGACGGG ATCGGAAAAC CTCTCGAGAA 120
AGGCGTCTAA CCAGTCACAG TCGCAAGGTA GGCTGAGCAC CGTGGCGGGC GGCAGCGGGC 180
GGCGGTCGGG GTTGTTTCTG GCGGAGGTGC TGCTGATGAT GTAATTAAAG TAGGGGGTCT 240
TGAGACGGCG GATGGTCGAC AGAAGCACCA TGTCCTTGGG TCCGGCCTGC TGAATGCGCA 300
GGCGGTCGGC CATGCCCCAG GCTTCGTTTT GACATCGGCG CAGGTCTTTG TAGTAGTCTT 360
GCATGAGCCT TTCTACCGGC ACTTCTTCTT CTCCTTCCTC TTGTCCTGCA TCTCTTGCAT 420
CTATCGCTGC GGCGGCGGCG GAGTTTGGCC GTAGGTGGCG CCCTCTTCCT CCCATGCGTG 480
TGACCCCGAA GCCCCTCATC GGCTGAAGCA GGGCTAGGTC GGCGACAACG CGCTCGGCTA 540
ATATGGCCTG CTGCACCTGC GTGAGGGTAG ACTGGAAGTC ATCCATGTCC ACAAAGCGGT 600
GGTATGCGCC CGTGTTGATG GTGTAAGTGC AGTTGGCCAT AACGGACCAG TTAACGGTCT 660
GGTGACCCGG CTGCGAGAGC TCGGTGTACC TGAGACGCGA GTAAGCCCTC GAGTCAAATA 720
CGTAGTCGTT GCAAGTCCGC ACCAGGTACT GGTATCCCAC CAAAAAGTGC GGCGGCGGCT 780
GGCGGTAGAG GGGCCAGCGT AGGGTGGCCG GGGCTCCGGG GGCGAGATCT TCCAACATAA 840
GGCGATGATA TCCGTAGATG TACCTGGACA TCCAGGTGAT GCCGGCGGCG GTGGTGGAGG 900
CGCGCGGAAA GTCGCGGACG CGGTTCCAGA TGTTGCGCAG CGGCAAAAAG TGCTCCATGG 960
TCGGGACGCT CTGGCCGGTC AGGCGCGCGC AATCGTTGAC GCTCTACCGT GCAAAAGGAG 1020
AGCCTGTAAG CGGGCACTCT TCCGTGGTCT GGTGGATAAA TTCGCAAGGG TATCATGGCG 1080
GACGACCGGG GTTCGAGCCC CGTATCCGGC CGTCCGCCGT GATCCATGCG GTTACCGCCC 1140
GCGTGTCGAA CCCAGGTGTG CGACGTCAGA CAACGGGGGA GTGCTCCTTT TGGCTTCCTT 1200
CCAGGCGCGG CGGCTGCTGC GCTAGCTTTT TTGGCCACTG GCCGCGCGCA GCGTAAGCGG 1260
TTAGGCTGGA AAGCGAAAGC ATTAAGTGGC TCGCTCCCTG TAGCCGGAGG GTTATTTTCC 1320
AAGGGTTGAG TCGCGGGACC CCCGGTTCGA GTCTCGGACC GGCCGGACTG CGGCGAACGG 1380
GGGTTTGCCT CCCCGTCATG CAAGACCCCG CTTGCAAATT CCTCCGGAAA CAGGGACGAG 1440
CCCCTTTTTT GCTTTTCCCA GATGCATCCG GTGCTGCGGC A 1481
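
The short sequences of SEQ ID NOS: 6 and 7 can be compared with the two ends of SEQ ID NO: 8. The Python sketch below is illustrative only and forms no part of the original disclosure. The helper names `reverse_complement` and `three_prime_overlap` are hypothetical, and only the first 20 and last 21 bases of SEQ ID NO: 8 are used. That SEQ ID NOS: 6 and 7 flank SEQ ID NO: 8 is an observation drawn from the listed sequences, not a statement taken from this listing.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def three_prime_overlap(oligo: str, template_start: str) -> int:
    """Length of the longest 3' suffix of oligo that begins template_start."""
    for k in range(min(len(oligo), len(template_start)), 0, -1):
        if template_start.startswith(oligo[-k:]):
            return k
    return 0


SEQ_ID_NO_6 = "TGCCGCAGCACCGGATGCATC"        # 21 bases, as listed
SEQ_ID_NO_7 = "GCGTCCGGAGGCTGCCATGCGGCAGGG"  # 27 bases, as listed
SEQ8_FIRST_20 = "GGCTGCCATGCGGCAGGGAT"       # bases 1-20 of SEQ ID NO: 8
SEQ8_LAST_21 = "GATGCATCCGGTGCTGCGGCA"       # bases 1461-1481 of SEQ ID NO: 8

# How many 3' bases of SEQ ID NO: 7 coincide with the start of SEQ ID NO: 8.
overlap_7 = three_prime_overlap(SEQ_ID_NO_7, SEQ8_FIRST_20)
# Whether SEQ ID NO: 6 is the reverse complement of the end of SEQ ID NO: 8.
matches_6 = reverse_complement(SEQ_ID_NO_6) == SEQ8_LAST_21
print(overlap_7, matches_6)
```

On the sequences as listed, the last 18 bases of SEQ ID NO: 7 coincide with the first 18 bases of SEQ ID NO: 8, and SEQ ID NO: 6 is the reverse complement of its last 21 bases.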



(2) INFORMATION FOR SEQ ID NO : 9 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3364 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:



GAATTCCCCA TCCTGGTCTA TAGAGAGAGT TCCAGAACAG CCAGGGCTAC AGATAAACCC 60 
ATCTGGAAAA ACAAAGTTGA ATGACCCAAG AGGGGTTCTC AGAGGGTGGC GTGTCCTCCC 120
TGGCAAGCCT ATGACATGGC CGGGGCCTGC CTCTCTCTGC CTCTGACCCT CAGTGGCTCC 180 
CATGAACTCC TTGCCCAATG GCATCTTTTT CCTGCGCTCC TTGGGTTATT CCAGTCTCCC 240 
CTCAGCATTC CTTCCTCAGG GCCTCGCTCT TCTCTCTGCT CCCTCCTTGC ACAGCTGGCT 300 
CTGTCCACCT CAGATGTCAC AGTGCTCTCT CAGAGGAGGA AGGCACCATG TACCCTCTGT 360 
TTCCCAGGTA AGGGTTCAAT TTTTAAAAAT GGTTTTTTGT TTGTTTGTTT GTTTGTTTGT 420 
TTGTTTGTTT TTCAAGACAG GGCTCCTCTG TGTAGTCCTA ACTGTCTTGA AACTCCCTCT 480 
GTAGACCAGG TCGACCTCGA ACTCTTGAAA CCTGCCACGG ACCACCCAGT CAGGTATGGA 540 
GGTCCCTGGA ATGAGCGTCC TCGAAGCTAG GTGGGTAAGG GTTCGGCGGT GACAAACAGA 600 
AACAAACACA GAGGCAGTTT GAATCTGAGT GTATTTTGCA GCTCTCAAGC AGGGGATTTT 660
ATACATAAAA AAAAAAAAAA AAAAAAAACC AAACATTACA TCTCTTAGAA ACTATATCCA 720
ATGAAACAAT CACAGATACC AACCAAAACC ATTGGGCAGA GTAAAGCACA AAAATCATCC 780
AAGCATTACA ACTCTGAAAC CATGTATTCA GTGAATCACA AACAGAACAG GTAACATCAT 840
TATTAATATA AATCACCAAA ATATAACAAT TCTAAAAGGA TGTATCCAGT GGGGGCTGTC 900
GTCCAAGGCT AGTGGCAGAT TTCCAGGAGC AGGTTAGTAA ATCTTAACCA CTGAACTAAC 960
TCTCCAGCCC CATGGTCAAT TATTATTTAG CATCTAGTGC CTAATTTTTT TTTATAAATC 1020
TTCACTATGT AATTTAAAAC TATTTTAATT CTTCCTAATT AAGGCTTTCT TTACCATATA 1080
CCAAAATTCA CCTCCAATGA CACACGCGTA GCCATATGAA ATTTTATTGT TGGGAAAATT 1140
TGTACCTATC ATAATAGTTT TGTAAATGAT TTAAAAAGCA AAGTGTTAGC CGGGCGTGGT 1200
GGCACACGCC TTTAATCCCT GCACTCGGGA GGCAGGGGCA GGAGGATTTC TGAGTTTGAG 1260
GCCAGCCTGG TCTACAGAGT GAGTTCCAGG ACAGCCAGGG CTACACAGAG AAACCCTGTC 1320
TCGAACCCCC CACCCCCCAA AAAAAGCAAA GTGTTGGTTT CCTTGGGGAT AAAGTCATGT 1380
TAGTGGCCCA TCTCTAGGCC CATCTCACCC ATTATTCTCG CTTAAGATCT TGGCCTAGGC 1440
TACCAGGAAC ATGTAAATAA GAAAAGGAAT AAGAGAAAAC AAAACAGAGA GATTGCCATG 1500
AGAACTACGG CTCAATATTT TTTCTCTCCG GCGAAGAGTT CCACAACCAT CTCCAGGAGG 1560
CCTCCACGTT TTGAGGTCAA TGGCCTCAGT CTGTGGAACT TGTCACACAG ATCTTACTGG 1620
AGGTGGTGTG GCAGAAACCC ATTCCTTTTA GTGTCTTGGG CTAAAAGTAA AAGGCCCAGA 1680
GGAGGCCTTT GCTCATCTGA CCATGCTGAC AAGGAACACG GGTGCCAGGA CAGAGGCTGG 1740
ACCCCAGGAA CACCTTAAAC ACTTCTTCCC TTCTCCGCCC CCTAGAGCAG GCTCGCCTCA 1800
CCAGCCTGGG CAGAAATGGG GGAAGATGGA GTGAAGCCAT ACTGGCTACT CCAGAATCAA 1860
CAGAGGGAGC CGGGGGCAAT ACTGGAGAAG CTGGTCTCCC CCCAGGGGCA ATCCTGGCAC 1920
CTCCCAGGCA GAAGAGGAAA CTTCCACAGT GCATCTCACT TCCATGAATC CCCTCCTCGG 1980
ACTCTGAGGT CCTTGGTCAC AGCTGAGGTG CAAAAGGCTC CTGTCATATT GTGTCCTGCT 2040
CTGGTCTGCC TTCCACAGCT TGGGGGCCAC CTAGCCCACC TCTCCCTAGG GATGAGAGCA 2100
GCCACTACGG GTCTAGGCTG CCCATGTAAG GAGGCAAGGC CTGGGGACAC CCGAGATGCC 2160
TGGTTATAAT TAACCCAGAC ATGTGGCTGC CCCCCCCCCC CCAACACCTG CTGCCTGAGC 2220
CTCACCCCCA CCCCGGTGCC TGGGTCTTAG GCTCTGTACA CCATGGAGGA GAAGCTCGCT 2280
CTAAAAATAA CCCTGTCCCT GGTGGATCCA GGGTGAGGGG CAGGCTGAGG GCGGCCACTT 2340
CCCTCAGCCG CAGGTTTGTT TTCCCAAGAA TGGTTTTTCT GCTTCTGTAG CTTTTCCTGT 2400
CAATTCTGCC ATGGTGGAGC AGCCTGCACT GGGCTTCTGG GAGAAACCAA ACCGGGTTCT 2460
AACCTTTCAG CTACAGTTAT TGCCTTTCCT GTAGATGGGC GACTACAGCC CCACCCCCAC 2520
CCCCGTCTCC TGTATCCTTC CTGGGCCTGG GGATCCTAGG CTTTCACTGG AAATTTCCCC 2580
CCAGGTGCTG TAGGCTAGAG TCACGGCTCC CAAGAACAGT GCTTGCCTGG CATGCATGGT 2640
TCTGAACCTC CAACTGCAAA AAATGACACA TACCTTGACC CTTGGAAGGC TGAGGCAGGG 2700
GGATTGCCAT GAGTGCAAAG CCAGACTGGG TGGCATAGTT AGACCCTGTC TCAAAAAACC 2760
AAAAACAATT AAATAACTAA AGTCAGGCAA GTAATCCTAC TCGGGAGACT GAGGCAGAGG 2820
GATTGTTACA TGTCTGAGGC CAGCCTGGAC TACATAGGGT TTCAGGCTAG CCCTGTCTAC 2880
AGAGTAAGGC CCTATTTCAA AAACACAAAC AAAATGGTTC TCCCAGCTGC TAATGCTCAC 2940
CAGGCAATGA AGCCTGGTGA GCATTAGCAA TGAAGGCAAT GAAGGAGGGT GCTGGCTACA 3000
ATCAAGGCTG TGGGGGACTG AGGGCAGGCT GTAACAGGCT TGGGGGCCAG GGCTTATACG 3060
TGCCTGGGAC TCCCAAAGTA TTACTGTTCC ATGTTCCCGG CGAAGGGCCA GCTGTCCCCC 3120
GCCAGCTAGA CTCAGCACTT AGTTTAGGAA CCAGTGAGCA AGTCAGCCCT TGGGGCAGCC 3180
CATACAAGGC CATGGGGCTG GGCAAGCTGC ACGCCTGGGT CCGGGGTGGG CACGGTGCCC 3240
GGGCAACGAG CTGAAAGCTC ATCTGCTCTC AGGGGCCCCT CCCTGGGGAC AGCCCCTCCT 3300
GGCTAGTCAC ACCCTGTAGG CTCCTCTATA TAACCCAGGG GCACAGGGGC TGCCCCCGGG 3360
TCAC 3364

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 174 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

GGTACCACTA CGGGTCTAGG CTGCCCATGT AAGGAGGCAA GGCCTGGGGA CACCCGAGAT 60

GCCTGGTTAT AATTAACCCC AACACCTGCT GCCCCCCCCC CCCCAACACC TGCTGCCTGA 120

GCCTGAGCGG TTACCCCACC CCGGTGCCTG GGTCTTAGGC TCTGTACACC ATGG 174

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 699 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

AGATCTTCCT AGAAATTCGC TGTCTGCGAG GGCCGGCTGT TGGGGTGAGT ACTCCCTCTC 60 

AAAAGCGGGC ATGACTTCTG CGCTAAGATT GTCAGTTTCC AAAAACGAGG AGGATTTGAT 120

ATTCACCTGG CCCGCGGTGA TGCCTTTGAG GGTGGCCGCG TCCATCTGGT CAGAAAAGAC 180

AATCTTTTTG TTGTCAAGCT TGAGGTGTGG CAGGCTTGAG ATCGATCTGG CCATACACTT 240

GAGTGACAAT GACATCCACT TTGCCTTTCT CTCCACAGGT GTCCACTCCC AGGTCCAACC 300

GCGGATCTCC CGGGACCATG CCCAAGAAGA AGAGGAAGGT GTCCAATTTA CTGACCGTAC 360

ACCAAAATTT GCCTGCATTA CCGGTCGATG CAACGAGTGA TGAGGTTCGC AAGAACCTGA 420

TGGACATGTT CAGGGATCGC CAGGCGTTTT CTGAGCATAC CTGGAAAATG CTTCTGTCCG 480

TTTGCCGGTC GTGGGCGGCA TGGTGCAAGT TGAATAACCG GAAATGGTTT CCCGCAGAAC 540

CTGAAGATGT TCGCGATTAT CTTCTATATC TTCAGGCGCG CGGTCTGGCA GTAAAAACTA 600

TCCAGCAACA TTTGGGCCAG CTAAACATGC TTCATCGTCG GTCCGGGCTG CCACGACCAA 660

GTGACAGCAA TGCTGTTTCA CTGGTTATGC GGCGGATCC 699

(2) INFORMATION FOR SEQ ID NO : 12 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 68 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "DNA" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

GAAGATCTAT AACTTCGTAT AATGTATGCT ATACGAAGTT ATTACCGAAG AAATGGCTCG 60

AGATCTTC 68

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 68 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

GAAGATCTCG AGCCATTTCT TCGGTAATAA CTTCGTATAG CATACATTAT ACGAAGTTAT 60

AGATCTTC 68

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "DNA" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

CCACATGTAT AACTTCGTAT AGCATACATT ATACGAAGTT ATACATGTGG 50

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 50 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "DNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

CCACATGTAT AACTTCGTAT AATGTATGCT ATACGAAGTT ATACATGTGG 50
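
The relationship among SEQ ID NOS: 12 to 15 can be examined directly from the listed bases. The Python sketch below is illustrative only and forms no part of the original disclosure. The names `reverse_complement` and `loxp_orientation` are hypothetical, the sequences are entered with the spacing of the scanned text removed, and the 34-base loxP sequence is supplied from general knowledge rather than from this listing.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def ungroup(listing: str) -> str:
    """Remove the spaces that separate the 10-base groups of a listing."""
    return "".join(listing.split())


SEQS = {
    12: ungroup("GAAGATCTAT AACTTCGTAT AATGTATGCT ATACGAAGTT "
                "ATTACCGAAG AAATGGCTCG AGATCTTC"),
    13: ungroup("GAAGATCTCG AGCCATTTCT TCGGTAATAA CTTCGTATAG "
                "CATACATTAT ACGAAGTTAT AGATCTTC"),
    14: ungroup("CCACATGTAT AACTTCGTAT AGCATACATT ATACGAAGTT ATACATGTGG"),
    15: ungroup("CCACATGTAT AACTTCGTAT AATGTATGCT ATACGAAGTT ATACATGTGG"),
}

# Assumption: the commonly cited 34-base loxP site (13-base inverted repeats
# around an 8-base spacer); not taken from this listing.
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"


def loxp_orientation(seq: str):
    """Report whether seq carries LOXP as written, reversed, or not at all."""
    if LOXP in seq:
        return "forward"
    if reverse_complement(LOXP) in seq:
        return "reverse"
    return None


pairs_are_complementary = (
    reverse_complement(SEQS[12]) == SEQS[13]
    and reverse_complement(SEQS[15]) == SEQS[14]
)
orientations = {n: loxp_orientation(s) for n, s in SEQS.items()}
print(pairs_are_complementary, orientations)
```

On the sequences as listed, SEQ ID NOS: 12 and 13 are reverse complements of each other, as are SEQ ID NOS: 14 and 15, and each contains the assumed loxP site in one orientation or the other.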



540 

600 

660 

699 



60 

68 



6 0 
68 



50 
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CLAIMS 

I • A recombinant plasmid, comprising in operable combination: 

a) a plasmid backbone, comprising an origin of replication, an 
antibiotic resistance gene and a eukaryotic promoter element;

b) the left and right inverted terminal repeats (ITRs) of
adenovirus, said ITRs each having a 5' and a 3' end and arranged in a tail
to tail orientation on said plasmid backbone;

c) the adenovirus packaging sequence, said packaging sequence
having a 5' and a 3' end and linked to one of said ITRs; and

d) a first gene of interest operably linked to said promoter 
element. 

2. The recombinant plasmid of Claim 1, wherein the total size of said
plasmid is between 27 and 40 kilobase pairs.

3. The recombinant plasmid of Claim 1, wherein said 5' end of said
packaging sequence is linked to said 3' end of said left ITR.

4. The recombinant plasmid of Claim 3, wherein said first gene of
interest is linked to said 3' end of said packaging sequence.

5. The recombinant plasmid of Claim 4, wherein said first gene of
interest is the dystrophin cDNA gene. 

6. The recombinant plasmid of Claim 4, further comprising a second 
gene of interest. 

7. The recombinant plasmid of Claim 6, wherein said second gene of
interest is linked to said 3' end of said right ITR.

8. The recombinant plasmid of Claim 7, wherein said second gene of
interest is a reporter gene.

9. The recombinant plasmid of Claim 8, wherein said reporter gene is
selected from the group consisting of the E. coli β-galactosidase gene, the human
placental alkaline phosphatase gene, the green fluorescent protein gene and the
chloramphenicol acetyltransferase gene.






10. A mammalian cell line containing the recombinant plasmid of 
Claim 1. 



11. The cell line of Claim 10, wherein said cell line is a 293-derived cell
line.



12. A helper adenovirus comprising i) first and a second loxP sequences,
and ii) the adenovirus packaging sequence, said packaging sequence having a 5'
and a 3' end.

13. The helper adenovirus of Claim 12, wherein said first loxP sequence
is linked to the 5' end of said packaging sequence and said second loxP sequence is
linked to said 3' end of said packaging sequence.

14. The helper adenovirus of Claim 12, further comprising at least one
adenovirus gene coding region.

15. A mammalian cell line, comprising:

a) a recombinant plasmid, comprising, in operable combination:

i) a plasmid backbone, comprising an origin of
replication, an antibiotic resistance gene and a eukaryotic promoter
element,

ii) the left and right inverted terminal repeats (ITRs) of
adenovirus, said ITRs each having a 5' and a 3' end and arranged in
a tail to tail orientation on said plasmid backbone,

iii) the adenovirus packaging sequence, said packaging
sequence having a 5' and a 3' end and linked to one of said ITRs,
and

iv) a first gene of interest operably linked to said
promoter element; and

b) a helper adenovirus comprising i) first and a second loxP
sequences, and ii) the adenovirus packaging sequence, said packaging
sequence having a 5' and a 3' end.

16. The cell line of Claim 15, wherein said helper adenovirus further
comprises at least one adenovirus gene coding region.

17. The cell line of Claim 15, wherein said first gene of interest is the
dystrophin cDNA gene.

18. A method of producing an adenovirus minichromosome, comprising:
A) providing a mammalian cell line containing:

a) a recombinant plasmid, comprising, in operable
combination, i) a plasmid backbone, comprising an origin of
replication, an antibiotic resistance gene and a eukaryotic promoter
element, ii) the left and right inverted terminal repeats (ITRs) of
adenovirus, said ITRs each having a 5' and a 3' end and arranged in
a tail to tail orientation on said plasmid backbone, iii) the adenovirus
packaging sequence, said packaging sequence having a 5' and a 3'
end and linked to one of said ITRs, and iv) a first gene of interest
operably linked to said promoter element; and

b) a helper adenovirus comprising i) first and a second
loxP sequences, ii) at least one adenovirus gene coding region, and
iii) the adenovirus packaging sequence, said packaging sequence
having a 5' and a 3' end; and

B) growing said cell line under conditions such that said 
adenovirus gene coding region is expressed and said recombinant plasmid 
directs the production of at least one adenoviral minichromosome. 

19. The method of Claim 18, wherein said adenovirus minichromosome
is encapsidated.

20. The method of Claim 18, further comprising 3) recovering said
encapsidated adenovirus minichromosome.

21. The method of Claim 20, further comprising 4) purifying said
recovered encapsidated adenovirus minichromosome.

22. The method of Claim 21, further comprising 5) administering said
purified encapsidated adenovirus minichromosome to a host.

23. The method of Claim 22, wherein said host is a mammal.

24. The method of Claim 23, wherein said mammal is a human. 

25. A recombinant adenovirus comprising the adenovirus E2b region
having a deletion, said adenovirus capable of self-propagation in a packaging cell
line and said E2b region comprising the DNA polymerase gene and the adenovirus
preterminal protein gene.

26. The recombinant adenovirus of Claim 25, wherein said deletion is
within the adenovirus DNA polymerase gene.

27. The recombinant adenovirus of Claim 25, wherein said deletion is
within the adenovirus preterminal protein gene.

28. The recombinant adenovirus of Claim 25, wherein said deletion is
within the adenovirus DNA polymerase and preterminal protein genes.

29. A mammalian cell line stably and constitutively expressing the
adenovirus E1 gene products and the adenovirus DNA polymerase.

30. The cell line of Claim 29, wherein said cell line comprises a
recombinant adenovirus comprising a deletion within the E2b region, said
recombinant adenovirus being capable of self-propagation in said cell line.
31. The cell line of Claim 30, wherein said deletion within said E2b
region comprises a deletion within the adenoviral DNA polymerase gene.

32. The cell line of Claim 29, wherein the genome of said cell line contains
a nucleotide sequence encoding adenovirus DNA polymerase operably linked to a
heterologous promoter.

33. The cell line of Claim 32, wherein said cell line is selected from the
group consisting of the B-6, B-9, C-1, C-4, C-7, C-13, and C-14 cell lines.

34. The cell line of Claim 29 further constitutively expressing the
adenovirus preterminal protein gene product.

35. The cell line of Claim 34, wherein said cell line comprises a
recombinant adenovirus comprising a deletion within the E2b region, said
recombinant adenovirus being capable of self-propagation in said cell line.

36. The cell line of Claim 35, wherein said deletion within said E2b
region comprises a deletion within the adenoviral preterminal protein gene.

37. The cell line of Claim 36, wherein said deletion within said E2b
region comprises a deletion within the adenoviral DNA polymerase and preterminal
protein genes.

38. The cell line of Claim 34, wherein the genome of said cell line
contains a nucleotide sequence encoding adenovirus preterminal protein operably
linked to a heterologous promoter.

39. The cell line of Claim 38, wherein said cell line is selected from the
group consisting of the C-1, C-4, C-7, C-13, and C-14 cell lines.
40. A method of producing infectious recombinant adenovirus particles
containing an adenoviral genome containing a deletion within the E2b region,
comprising:

a) providing:

i) a mammalian cell line stably and constitutively
expressing the adenovirus E1 gene products and the adenovirus DNA
polymerase;

ii) a recombinant adenovirus comprising a deletion within
the E2b region, said recombinant adenovirus being capable of
self-propagation in said cell line;

b) introducing said recombinant adenovirus into said cell line
under conditions such that said recombinant adenovirus is propagated to
form infectious recombinant adenovirus particles; and

c) recovering said infectious recombinant adenovirus particles.

41. The method of Claim 40, further comprising d) purifying said
recovered infectious recombinant adenovirus particles.

42. The method of Claim 41, further comprising e) administering said
purified recombinant adenovirus particles to a host.

43. The method of Claim 40, wherein said mammalian cell line further
constitutively expresses the adenovirus preterminal protein.

44. A recombinant plasmid capable of replicating in a bacterial host 
comprising adenoviral E2b sequences, said E2b sequences containing a deletion 
within the polymerase gene, said deletion resulting in reduced polymerase activity. 

45. The recombinant plasmid of Claim 44, wherein said deletion
comprises a deletion of nucleotides 8772 to 9385 in SEQ ID NO:4.

46. The recombinant plasmid of Claim 45, wherein said plasmid has the
designation pΔpol.

47. The recombinant plasmid of Claim 45, wherein said plasmid has the
designation pBHG11Δpol.



48. A recombinant plasmid capable of replicating in a bacterial host
comprising adenoviral E2b sequences, said E2b sequences containing a deletion
within the preterminal protein gene, said deletion resulting in the inability to
express functional preterminal protein without disruption of the VA RNA genes.

49. The recombinant plasmid of Claim 48, wherein said deletion within
said preterminal protein gene comprises a deletion of nucleotides 10,705 to 11,134
in SEQ ID NO:4.

50. The recombinant plasmid of Claim 49, wherein said plasmid has the
designation pΔpTP.

51. The recombinant plasmid of Claim 49, wherein said plasmid has the
designation pBHG11ΔpTP.

52. The recombinant plasmid of Claim 48 further comprising a deletion
within the polymerase gene, said deletion resulting in reduced polymerase activity.

53. The recombinant plasmid of Claim 52, wherein said deletion within
said polymerase and said preterminal protein genes comprises a deletion of
nucleotides 8,773 to 9,586 and 11,067 to 12,513 in SEQ ID NO:4.

54. The recombinant plasmid of Claim 53, wherein said plasmid has the
designation pAXBΔpolΔpTPVARNA+t13.

55. The recombinant plasmid of Claim 53, wherein said plasmid has the
designation pBHG11ΔpolΔpTPVARNA+t13.